[pmg-devel] [PATCH pmg-api] config: adjust max_filters calculation to reflect current memory usage
Stoiko Ivanov
s.ivanov at proxmox.com
Mon Jan 15 19:05:53 CET 2024
hi,
a bit late to the discussion on-list - had a few chats with Dominik
off-list and did some minimal testing:
On Fri, 12 Jan 2024 11:13:57 +0100
Thomas Lamprecht <t.lamprecht at proxmox.com> wrote:
> Am 10/01/2024 um 12:56 schrieb Markus Frank:
> > One pmg-smtp-filter process uses at least 220 MiB.
> > When having 100000 rules one process can take up to 330 MiB.
>
> That's probably talking about RSS here, right? That would be rather useless, as
> it re-counts the memory used by shared libraries, which bloats the number,
> as they're actually only loaded once in memory.
>
> What is the newer PSS (Proportional Set Size) metric in that case? As that
> would be a better metric due to actually accounting for the proportional use
> of shared libraries.
>
> For example, use the following bash one-liner to get both PSS and RSS of
> each pmg-smtp-filter process:
>
> for pid in $(pidof pmg-smtp-filter); do printf "PID %s: " $pid; awk '/Pss:/{ pss += $2 } /Rss:/{ rss += $2 } END { print "PSS =", pss, " RSS =", rss }' "/proc/$pid/smaps"; done
TIL: PSS/USS - thanks! (also for the one-liner :)
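For readability, the same summation could be written as a small shell function. This is only a sketch: the name smaps_summary is made up here, and USS is assumed to be the sum of the Private_Clean and Private_Dirty fields. Anchoring the patterns with ^ keeps the SwapPss: lines, which a bare /Pss:/ also matches, out of the PSS sum.

```shell
# Sum PSS, RSS and USS (all in kB) from a smaps-formatted file.
# smaps_summary is a hypothetical helper name, not part of PMG.
smaps_summary() {
    awk '
        /^Pss:/                   { pss += $2 }
        /^Rss:/                   { rss += $2 }
        # Assumption: USS = Private_Clean + Private_Dirty
        /^Private_(Clean|Dirty):/ { uss += $2 }
        END { printf "PSS = %d RSS = %d USS = %d\n", pss, rss, uss }
    ' "$1"
}

# Usage on a live system (needs permission to read /proc/<pid>/smaps):
# for pid in $(pidof pmg-smtp-filter); do
#     printf "PID %s: " "$pid"; smaps_summary "/proc/$pid/smaps"
# done
```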
>
> Here, on a pretty much idle setup with only a handful of rules, I get:
>
> PID 405810: PSS = 84700 RSS = 225000
> PID 405809: PSS = 84714 RSS = 225000
for comparison - 2 more loaded productive instances (although also with a
quite small ruleset) I have access to:
PID 2908376: PSS = 114567 RSS = 227528
PID 2908355: PSS = 115023 RSS = 227776
and
PID 788678: PSS = 65242 RSS = 217564
PID 788600: PSS = 69126 RSS = 220656
PID 788507: PSS = 102765 RSS = 227596
>
> As PSS is what matters here, the 84.7 MB are quite in line with the 120 MB
> $servermem, at least for my (underused) setup.
I tried to get a few more filter processes to spawn (while running the
one-liner from above under `watch -n2`), by:
* stopping postfix on the sending system
* queueing 250 (also 1250) tiny mails to one recipient there
* starting postfix (so it starts to drain the queue at once)
to get more than 2-3 filter processes I had to either add a sleep to the
processing (e.g. at analyze_virus_clam), or send a larger mail (with a pdf
attachment that gets handed through pdf2text).
With that (and a small ruleset, without huge numbers of objects) the
PSS size was between 31967 and 62725 kB for 20 parallel processes
(sometimes some pids did not yield one or the other metric - probably
while being started/torn down).
Numbers were similar for both types of mail (small text/plain only, and
with a ~156k pdf attachment).
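To read off such a range without eyeballing the `watch` output, the one-liner's output can be piped through a small filter. A rough sketch, assuming the "PID <pid>: PSS = <kB> RSS = <kB>" line format from above; pss_range is a made-up name, and lines where a metric is missing are skipped:

```shell
# pss_range is a hypothetical helper: reads the output of the one-liner
# on stdin and prints how many lines had a PSS value, plus min and max (kB).
pss_range() {
    awk '
        # Fields: PID <pid>: PSS = <kB> RSS = <kB>
        # Lines without a numeric PSS value are ignored.
        $3 == "PSS" && $5 ~ /^[0-9]+$/ {
            v = $5 + 0
            if (n == 0 || v < min) min = v
            if (n == 0 || v > max) max = v
            n++
        }
        END { if (n) printf "n=%d min=%d max=%d\n", n, min, max }
    '
}
```

Fed with the three lines of the second productive instance above, this prints "n=3 min=65242 max=102765".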
>
> In the get_max_filters calculation I'd rather look at the accounting for the
> system baseline memory usage through subtracting 512 MB, as that is rather
> way too low nowadays, ClamAV alone takes up 1.2 - 1.5 GB.
Would also be my first go-to for adaptation (based on guess-work and
watching top/htop output of a few PMG instances in our support-channels).
I can't remember any other case (than the one that led to this
patch) of OOM-killed pmg-smtp-filter processes (for any system with >
2 GB memory).
Neither do I recall recommending someone to manually tweak the
max_filter setting - so while adaptation might make sense - I think even
the current code works well in practice.
We do not need to cover all possible scenarios with it - if someone
adds tons of signatures to ClamAV (as described in the thread from the
commit-message), or avast, or many SpamAssassin plugins - they always have
the option to limit the filter-processes in the config.
>
> Maybe turn that up first depending on physical_memory, i.e., for < 2 GB I'd
> keep it as is, otherwise deduct something like 1.5 GB.
>
> Then check the actual memory growth per added filter process via checking
> the PSS sizes; if huge setups turn out to use 200 MB we could increase that
> a bit, but it probably won't need to be the 300 MB of your patch. And we can
> always make $servermem dependent on the available memory size too, e.g., assume
> bigger rule sets due to bigger resources available, like > 4 GB (or 8 GB) memory.
>
> i.e., having a three-branch if here to cover the cases for
> - "low-memory but might work for small setups"
> - "ok'ish memory but needs some special tuning"
> - "more than the minimal recommended amount of memory"
>
> if ($memory < 2000) {
>     warn "low amount of system memory installed, minimum requirement is 2 GB, recommended is 4+ GB\n";
>     $base = $memory > 1536 ? 1024 : 512;
>     $servermem = 120;
> } elsif ($memory < 4096) {
>     $base = 1500;
>     $servermem = 150;
> } else {
>     $base = 2500;
>     $servermem = 200;
> }
>
> The $base and $servermem values are just guesstimated without too much
> thought and can surely be better chosen (PSS size of huge setups would be
> required for that).
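To get a feeling for what the guesstimated values would mean in practice, the branches can be replayed in shell. This is only a sketch: filter_budget is a made-up name, memory is taken in MB, and it only computes how many filter processes fit into (memory - base) at $servermem each. Any fixed minimum or upper cap the real get_max_filters applies is left out.

```shell
# filter_budget is a hypothetical helper; thresholds are the guesstimated
# values from the quoted snippet. Argument: physical memory in MB.
filter_budget() {
    memory=$1
    if [ "$memory" -lt 2000 ]; then
        echo "low amount of system memory installed" >&2
        if [ "$memory" -gt 1536 ]; then base=1024; else base=512; fi
        servermem=120
    elif [ "$memory" -lt 4096 ]; then
        base=1500
        servermem=150
    else
        base=2500
        servermem=200
    fi
    # Memory left for filter processes after the baseline, never negative.
    avail=$((memory - base))
    if [ "$avail" -lt 0 ]; then avail=0; fi
    echo "base=$base servermem=$servermem extra_filters=$((avail / servermem))"
}

filter_budget 1800   # base=1024 servermem=120 extra_filters=6
filter_budget 2048   # base=1500 servermem=150 extra_filters=3
filter_budget 8192   # base=2500 servermem=200 extra_filters=28
```

With these placeholder numbers a 1800 MB system ends up with more room for filter processes than a 2048 MB one, so the values around the 2 GB boundary would probably need some smoothing.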
>
> A completely different alternative:
> rewrite the filter in rust and decimate the memory usage, making it actually
> more performant and allowing really higher throughput.
For this I usually handwave around - pointing to SpamAssassin being the
largest memory-hog (and run-time and cpu) for the complete pmg-smtp-filter
- but I never actually checked if that's true - so I tried removing
SpamAssassin (very crudely by commenting out the relevant imports and
replacing the actual checking with a no-op). The test with the small
mail described above yields:
PSS = 13829 RSS = 82420
so ~75-80% of the memory of each pmg-smtp-filter seems to be used by SpamAssassin.
Not that there isn't quite a bit of room for improvement in the code-base,
but I'd probably look at pmgpolicy for rewriting first (simply because of
SpamAssassin, and my doubt that running SA in another way (spamd[0],
writing our own spamd in rust) will make it significantly faster/smaller).
summing up - the warn for low memory Thomas suggested sounds like a good
idea, I'd probably only stick with 2 branches (<=2.5GB and >2.5GB) - but
fine either way
my tests with a small ruleset, but many filter-processes, put the
memory-requirement (PSS-based) at < 70 MiB per filter-process