[pve-devel] [PATCH pve-manager v2] api: ceph: improve reporting of ceph OSD memory usage

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Sep 1 16:00:48 CEST 2023


Am 16/08/2023 um 17:21 schrieb Stefan Hanreich:
> Currently we are using the MemoryCurrent property of the OSD service
> to determine the used memory of a Ceph OSD. This includes, among other
> things, the memory used by buffers [1]. Since BlueFS uses buffered
> I/O, this can lead to extremely high values shown in the UI.
> 
> Instead we are now reading the PSS value from the proc filesystem,
> which should more accurately reflect the amount of memory currently
> used by the Ceph OSD.
> 
> We decided on PSS over RSS, since this should give a better idea of

Who's "we"?

> used memory - particularly when using a large amount of OSDs on one
> host, since the OSDs share some of the pages.

fine for me, I'd hint that in the UI too though, e.g., using
"Memory (PSS)" as label.

> 
> [1] https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
> 
> Signed-off-by: Stefan Hanreich <s.hanreich at proxmox.com>
> ---
> 
> Changes from v1:
>  * Now returns 0 instead of null in case of stopped OSDs in order to
>  preserve backwards compatibility
> 
> 
>  PVE/API2/Ceph/OSD.pm | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 



> +	    open (my $SMAPS, '<', "/proc/$pid/smaps_rollup")
> +		or die 'Could not open smaps_rollup for Ceph OSD';

The die is missing a trailing newline, which then will result in
showing the user a rather ugly "died at line ... in .."  suffixed.

Please also include the message of what the actual error was in the
die using $! – otherwise such things, especially due to being probably
rather rare, are unnecessarily hard to debug.

Would maybe reword it a bit too "smaps_rollup" is probably a bit
odd to read for some users ^^

nit, we normally start error messages lower case – while we have no
hard style rule for that so no hard feelings, just mentioning as it
stuck out to me.

So maybe something like:

    or die "failed to read PSS memory-stat from process - $!\n";

Oh, and I would move that open + parse stuff to a private local
method, as it only crowds the API endpoint's code and might be better
off if moved to PVE::ProcFSTools or the like in the future (but we
don't use PSS anywhere else, so it can live in this module for now),
something like


my sub get_proc_pss_from_pid {
    my ($pid) = @_;

    # ...





More information about the pve-devel mailing list