[PVE-User] [Extern] - Re: "nearfull" status in PVE Dashboard not consistent
Frank Thommen
f.thommen at dkfz-heidelberg.de
Wed Sep 11 16:24:09 CEST 2024
I will have a look. However, since I have no real working experience
with Ceph, using an external balancer requires a "leap of faith" on my
part :-)
On 11.09.24 13:52, Daniel Oliver wrote:
> The built-in Ceph balancer only balances based on PG counts, and PGs can vary wildly in size for several reasons.
>
> I ended up disabling the built-in balancer and switching to https://github.com/TheJJ/ceph-balancer, which we now run daily with the following parameters:
> placementoptimizer.py balance --ignore-ideal-pgcounts=all --osdused=delta --osdfrom fullest
>
> This keeps things nicely balanced from a fullness perspective, with the most important bit being ignore-ideal-pgcounts, as it allows balancing decisions outside of what the built-in balancer would decide.
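To illustrate the point about PG counts versus PG sizes, here is a small Python sketch. The OSD names and PG sizes are invented for illustration, and it is not how either balancer is implemented; it only shows that equal PG counts need not mean equal fullness.

```python
# Hypothetical PG sizes in GiB per OSD; all numbers are invented.
pgs_per_osd = {
    "osd.0": [10, 12, 11, 9],
    "osd.1": [40, 35, 50, 45],
}

def pg_count(pgs):
    """Number of PGs on an OSD (what a count-based balancer equalizes)."""
    return len(pgs)

def used_gib(pgs):
    """Space actually consumed by those PGs."""
    return sum(pgs)

for osd, pgs in pgs_per_osd.items():
    print(f"{osd}: {pg_count(pgs)} PGs, {used_gib(pgs)} GiB used")
```

Both OSDs hold the same number of PGs, yet one of them stores several times more data than the other.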
>
> From: pve-user <pve-user-bounces at lists.proxmox.com> on behalf of Frank Thommen <f.thommen at dkfz-heidelberg.de>
> Date: Wednesday, 11 September 2024 at 12:00
> To: pve-user at lists.proxmox.com <pve-user at lists.proxmox.com>
> Subject: Re: [PVE-User] [Extern] - Re: "nearfull" status in PVE Dashboard not consistent
> The OSDs are of different sizes, because we have 4 TB and 2 TB disks in
> the systems.
>
> We might give the reweight a try.
>
> On 10.09.24 20:31, David der Nederlanden | ITTY via pve-user wrote:
>> Hi Frank,
>>
>> The images didn't work 🙂
>>
>> Pool and OSD nearfull warnings are closely related. When OSDs get
>> full, your pool also goes nearfull: Ceph needs to be able to follow
>> the CRUSH rules, which it can't do if one of the OSDs is full, hence
>> the warning when one gets nearfull.
>>
>> I see that you're mixing OSD sizes. Deleting and recreating the OSDs
>> one by one caused this: the OSDs got new weights, so you should be OK
>> once you reweight them.
>> You can do this by hand or with reweight-by-utilization, whichever
>> you prefer.
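As a rough sketch of why weights matter with mixed 4 TB and 2 TB disks, the Python snippet below assumes that data lands on each OSD in proportion to its weight, which is a simplification of what CRUSH does. The disk layout, the data volume and the function name are made up for the example.

```python
def expected_utilization(sizes_tb, weights, data_tb):
    """Fraction of each OSD that is used if data is spread in
    proportion to the weights (a simplification of CRUSH)."""
    total_weight = sum(weights)
    return [data_tb * w / total_weight / size
            for size, w in zip(sizes_tb, weights)]

sizes = [4, 4, 2, 2]  # two 4 TB and two 2 TB disks (example layout)
data = 6              # TB of raw data to place (made-up figure)

# Weights proportional to disk size: every OSD fills at the same rate.
balanced = expected_utilization(sizes, weights=sizes, data_tb=data)

# Identical weights on all disks: the 2 TB disks fill twice as fast.
skewed = expected_utilization(sizes, weights=[1, 1, 1, 1], data_tb=data)

print("size-proportional weights:", balanced)
print("identical weights:        ", skewed)
```

With size-proportional weights all four OSDs end up equally full, while identical weights push the small disks towards nearfull long before the large ones.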
>>
>> I'm not quite sure about the pool sizes, but an RBD pool with a 2/3
>> min/max (min_size/size) rule should never be more than 80% full.
>> Above that, losing one node gives you a nearfull pool as soon as
>> backfilling starts, or in the worst case a full pool, which renders
>> the pool read-only.
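The arithmetic behind that warning can be sketched in Python as follows. This is a simplified model, not a capacity planner: it assumes equally sized nodes, evenly spread data, and enough remaining nodes to re-create the lost replicas. The ratios are Ceph's default nearfull and full ratios; the node counts and function names are only examples.

```python
NEARFULL_RATIO = 0.85  # Ceph's default nearfull ratio
FULL_RATIO = 0.95      # Ceph's default full ratio

def utilization_after_node_loss(used_fraction, nodes):
    """Utilization once the data of one lost node has been
    backfilled onto the remaining, equally sized nodes."""
    if nodes < 2:
        raise ValueError("need at least two nodes")
    return used_fraction * nodes / (nodes - 1)

def status(used_fraction):
    """Classify a utilization against the two thresholds above."""
    if used_fraction >= FULL_RATIO:
        return "full"
    if used_fraction >= NEARFULL_RATIO:
        return "nearfull"
    return "ok"

for nodes in (6, 16):
    after = utilization_after_node_loss(0.80, nodes)
    print(f"{nodes} nodes at 80%: {after:.0%} after losing one -> {status(after)}")
```

In this model a cluster that is 80% full ends up nearfull after a node loss even with 16 nodes, and full with 6 nodes.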
>>
>> Kind regards,
>> David