[PVE-User] ceph rebalance/ raw vs pool usage

Mark Adams mark at openvs.co.uk
Wed May 8 10:34:44 CEST 2019


Thanks for getting back to me Alwin. See my response below.

On Wed, 8 May 2019 at 08:10, Alwin Antreich <a.antreich at proxmox.com> wrote:

> Hello Mark,
>
> On Tue, May 07, 2019 at 11:26:17PM +0100, Mark Adams wrote:
> > Hi All,
> >
> > I would appreciate a little pointer or clarification on this.
> >
> > My "ceph" vm pool is showing 84.80% used, but the %RAW usage is only
> > 71.88% used. Is this normal? There is nothing else on this ceph cluster
> > apart from this one pool.
> It is normal that the pool's used-% is higher than the raw-% usage of
> the cluster, because for one, the bluestore OSDs (DB+WAL) occupy
> ~1.5GiB by default. And depending on which OSDs the pool resides on
> (class-based rules), the two numbers may diverge even further.
>
> The general %-usage numbers of your cluster may not allow a recovery if
> a node or multiple OSDs fail. Consider adding more disks or reducing
> the data usage.
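
A rough sketch of the arithmetic behind this warning, using the figures from the 'ceph osd df tree' output further down in this message. It assumes (this is not stated anywhere in the thread) that the data of a failed OSD is rebuilt on the nine remaining OSDs of the same host, as would happen with one replica per host on three hosts, and that the current imbalance (maximum VAR of 1.14) persists. The 0.85/0.90/0.95 ratios are Ceph's default nearfull/backfillfull/full thresholds; the cluster may be configured differently.

```python
# Figures taken from the 'ceph osd df tree' output in this message.
OSD_SIZE_TIB = 6.99    # size of each OSD
OSDS_PER_HOST = 10     # OSDs per host
HOST_USED_TIB = 50.2   # raw data stored on one host
MAX_VAR = 1.14         # fullest OSD relative to the average

# Ceph default thresholds (assumed unchanged on this cluster).
NEARFULL, BACKFILLFULL, FULL = 0.85, 0.90, 0.95

# Assumption: one OSD fails and its data is rebuilt on the same host.
remaining_capacity = OSD_SIZE_TIB * (OSDS_PER_HOST - 1)
avg_fill = HOST_USED_TIB / remaining_capacity

# Assumption: the fullest OSD stays 14% above the average.
worst_fill = avg_fill * MAX_VAR

print(f"average fill after losing one OSD: {avg_fill:.1%}")
print(f"projected fill of the fullest OSD: {worst_fill:.1%}")
print(f"above nearfull ({NEARFULL:.0%}):     {worst_fill > NEARFULL}")
print(f"above backfillfull ({BACKFILLFULL:.0%}): {worst_fill > BACKFILLFULL}")
```

Under these assumptions the average OSD on the affected host ends up at roughly 80%, and the fullest one would cross the backfillfull threshold, at which point backfill onto it stops.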


> >
> > Also, I notice some of my OSDs are out of balance. I've done some
> > reweighting using "ceph osd reweight-by-utilization" which has helped a
> > bit, but I think it needs to be tweaked some more (some OSDs are still
> > over 82% utilised, while some are around 65-70%).
> This is something that should be done while keeping an eye on the general
> cluster performance and on how the OSDs fill up further. Could you please
> post a 'ceph osd df tree'? It seems to me the cluster is unevenly balanced
> in disk size and/or count.
>

I have the same size and count in each node, but I have had a disk failure
(the disk has been replaced) and also had issues with OSDs dropping when that
memory allocation bug was around just before last Christmas (I think it was
when they made some bluestore updates; in the next release they increased
the default memory allocation to rectify the issue), so that could have
messed up the balance.

ceph osd df tree:

ID CLASS WEIGHT    REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME
-1       209.58572        -  210TiB  151TiB 58.8TiB 71.92 1.00   - root default
-3        69.86191        - 69.9TiB 50.2TiB 19.6TiB 71.91 1.00   -     host prod-pve1
 0   ssd   6.98619  0.90002 6.99TiB 5.70TiB 1.29TiB 81.54 1.13 116         osd.0
 1   ssd   6.98619  1.00000 6.99TiB 5.49TiB 1.49TiB 78.65 1.09 112         osd.1
 2   ssd   6.98619  1.00000 6.99TiB 4.95TiB 2.03TiB 70.88 0.99 101         osd.2
 4   ssd   6.98619  1.00000 6.99TiB 4.90TiB 2.09TiB 70.11 0.97 100         osd.4
 5   ssd   6.98619  1.00000 6.99TiB 4.52TiB 2.47TiB 64.67 0.90  92         osd.5
 6   ssd   6.98619  1.00000 6.99TiB 5.34TiB 1.64TiB 76.50 1.06 109         osd.6
 7   ssd   6.98619  1.00000 6.99TiB 4.56TiB 2.42TiB 65.31 0.91  93         osd.7
 8   ssd   6.98619  1.00000 6.99TiB 4.91TiB 2.08TiB 70.21 0.98 100         osd.8
 9   ssd   6.98619  1.00000 6.99TiB 4.66TiB 2.32TiB 66.76 0.93  95         osd.9
30   ssd   6.98619  1.00000 6.99TiB 5.20TiB 1.78TiB 74.49 1.04 106         osd.30
-5        69.86191        - 69.9TiB 50.3TiB 19.6TiB 71.93 1.00   -     host prod-pve2
10   ssd   6.98619  1.00000 6.99TiB 4.47TiB 2.52TiB 63.92 0.89  91         osd.10
11   ssd   6.98619  1.00000 6.99TiB 4.86TiB 2.13TiB 69.53 0.97  99         osd.11
12   ssd   6.98619  1.00000 6.99TiB 4.46TiB 2.52TiB 63.91 0.89  91         osd.12
13   ssd   6.98619  1.00000 6.99TiB 4.71TiB 2.28TiB 67.43 0.94  96         osd.13
14   ssd   6.98619  1.00000 6.99TiB 5.50TiB 1.49TiB 78.68 1.09 112         osd.14
15   ssd   6.98619  1.00000 6.99TiB 5.20TiB 1.79TiB 74.38 1.03 106         osd.15
16   ssd   6.98619  1.00000 6.99TiB 4.66TiB 2.32TiB 66.74 0.93  95         osd.16
17   ssd   6.98619  1.00000 6.99TiB 5.51TiB 1.48TiB 78.84 1.10 112         osd.17
18   ssd   6.98619  1.00000 6.99TiB 5.40TiB 1.59TiB 77.24 1.07 110         osd.18
19   ssd   6.98619  1.00000 6.99TiB 5.50TiB 1.49TiB 78.66 1.09 112         osd.19
-7        69.86191        - 69.9TiB 50.2TiB 19.6TiB 71.93 1.00   -     host prod-pve3
20   ssd   6.98619  1.00000 6.99TiB 4.22TiB 2.77TiB 60.40 0.84  86         osd.20
21   ssd   6.98619  1.00000 6.99TiB 4.43TiB 2.56TiB 63.35 0.88  90         osd.21
22   ssd   6.98619  0.95001 6.99TiB 5.69TiB 1.30TiB 81.45 1.13 116         osd.22
23   ssd   6.98619  1.00000 6.99TiB 4.67TiB 2.32TiB 66.79 0.93  95         osd.23
24   ssd   6.98619  0.95001 6.99TiB 5.74TiB 1.24TiB 82.20 1.14 117         osd.24
25   ssd   6.98619  1.00000 6.99TiB 4.51TiB 2.47TiB 64.59 0.90  92         osd.25
26   ssd   6.98619  1.00000 6.99TiB 4.90TiB 2.09TiB 70.15 0.98 100         osd.26
27   ssd   6.98619  1.00000 6.99TiB 5.39TiB 1.59TiB 77.21 1.07 110         osd.27
28   ssd   6.98619  1.00000 6.99TiB 5.69TiB 1.29TiB 81.47 1.13 116         osd.28
29   ssd   6.98619  1.00000 6.99TiB 5.00TiB 1.98TiB 71.63 1.00 102         osd.29
                      TOTAL  210TiB  151TiB 58.8TiB 71.92

MIN/MAX VAR: 0.84/1.14  STDDEV: 6.44
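
To make the VAR column and the reweight threshold concrete, here is a minimal Python sketch that recomputes them from the %USE values above. It assumes that VAR is each OSD's %USE divided by the mean %USE, and that "reweight-by-utilization 101" considers OSDs whose utilisation exceeds 101% of that mean. This is only a simplified model of the selection step; the real command also applies the max_change and max_osds limits and is not reproduced here.

```python
# %USE per OSD, copied from the 'ceph osd df tree' output above.
pct_use = {
    "osd.0": 81.54, "osd.1": 78.65, "osd.2": 70.88, "osd.4": 70.11,
    "osd.5": 64.67, "osd.6": 76.50, "osd.7": 65.31, "osd.8": 70.21,
    "osd.9": 66.76, "osd.30": 74.49,
    "osd.10": 63.92, "osd.11": 69.53, "osd.12": 63.91, "osd.13": 67.43,
    "osd.14": 78.68, "osd.15": 74.38, "osd.16": 66.74, "osd.17": 78.84,
    "osd.18": 77.24, "osd.19": 78.66,
    "osd.20": 60.40, "osd.21": 63.35, "osd.22": 81.45, "osd.23": 66.79,
    "osd.24": 82.20, "osd.25": 64.59, "osd.26": 70.15, "osd.27": 77.21,
    "osd.28": 81.47, "osd.29": 71.63,
}

OLOAD = 101  # first argument of 'reweight-by-utilization 101 0.05 15'

# All OSDs have the same size, so the plain mean matches the TOTAL line.
mean_use = sum(pct_use.values()) / len(pct_use)

# VAR: utilisation of each OSD relative to the mean.
var = {osd: use / mean_use for osd, use in pct_use.items()}

# OSDs above OLOAD percent of the mean are the reweight candidates.
threshold = mean_use * OLOAD / 100
over = [osd for osd, use in pct_use.items() if use > threshold]

print(f"mean %USE: {mean_use:.2f}")
print(f"MIN/MAX VAR: {min(var.values()):.2f}/{max(var.values()):.2f}")
print(f"{len(over)} OSDs above {threshold:.2f}%: {', '.join(over)}")
```

With these inputs the sketch reproduces the 0.84/1.14 MIN/MAX VAR line and finds 13 OSDs above the threshold, which lines up with the 13 OSDs mentioned below.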



>
> >
> > Is it safe enough to keep tweaking this? (I believe I should run ceph osd
> > reweight-by-utilization 101 0.05 15) Are there any gotchas I need to be
> > aware of when doing this, apart from the obvious load of reshuffling the
> > data around? The cluster has 30 OSDs and it looks like it will reweight
> > 13.
> Your cluster may get more and more unbalanced, e.g. making an OSD
> replacement a bigger challenge.
>
>
It can make the balance worse? I thought the whole point was to get it back
in balance! :)


> --
> Cheers,
> Alwin
>

Cheers,
Mark

