[pmg-devel] [RFC pmg-api/docs] minimal before queue filtering support

Tue Nov 12 15:16:09 CET 2019

This patchset should eventually address a frequently requested feature for PMG:
before-queue filtering. (The exception is 1/4 for pmg-api, which is an
unrelated tiny nit I caught by accident.)

Technically, the work follows the postfix before-queue content filtering howto
[0], which makes it possible to scan before-queue without the need to implement
the milter protocol.

pmg-smtp-filter (rather PMG::SMTP) already had a currently unused branch for
handling SMTP connections (the current after-queue solution uses LMTP), which
only needed a slight adaptation.

Handling mails with a single recipient should now work without any problems:

* if the result of the rule-system is 'Block' pmg-smtp-filter rejects the mail
  with '554 5.7.1 Rejected for policy reasons' (the same response code
  postscreen uses for rbl-hits)

* if the result is either 'Accept' or 'Quarantine' pmg-smtp-filter accepts the
  mail

* if there's a problem handing the mail back to postfix (port 10025), the
  response is a temporary failure

The situation is slightly more complicated if one mail is to be delivered to
multiple recipients (I'd say this is a general thing with SMTP, since there is
only a single response after DATA for all recipients):

* if the rule-system 'Blocks' the mail for all recipients, the mail gets
  rejected (with 554)

* if the mail is accepted for at least one recipient, pmg-smtp-filter returns
  250. Additionally, in order to comply with the requirement some users have
  of never dropping mail without notification, a bounce-message (NDR) is
  generated for all recipients for which the message was 'Blocked' (if any).
  The sending of the NDR can be configured with a flag in pmg.conf (as can the
  activation of before-queue filtering).
  Differing results for multiple recipients can probably happen with the
  default ruleset of PMG (through the User Black/Whitelist), or with (probably
  too) complicated rulesets.
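
To make the cases above easier to discuss, here is a minimal sketch of the
decision logic in Python (the actual implementation lives in the Perl module
PMG::SMTP and is not reproduced here). The function name, the `send_ndr` and
`reinject_ok` parameters, and the exact text of the 250 and 4xx replies are
assumptions made for illustration; only the 554 text is taken from the
description above.

```python
from typing import Dict, List, Tuple

REJECT = "554 5.7.1 Rejected for policy reasons"   # text from the cover letter
ACCEPT = "250 2.5.0 OK"                            # reply text is an assumption
TEMPFAIL = "451 4.3.0 Temporary failure"           # code and text are assumptions


def smtp_response(verdicts: Dict[str, str],
                  send_ndr: bool = True,
                  reinject_ok: bool = True) -> Tuple[str, List[str]]:
    """Map per-recipient rule results to one SMTP reply plus NDR targets.

    verdicts maps each recipient to 'Block', 'Accept' or 'Quarantine'.
    Returns the single reply for the whole mail and the list of recipients
    for which a bounce (NDR) would be generated.
    """
    if not verdicts:
        raise ValueError("at least one recipient is required")

    blocked = [rcpt for rcpt, result in verdicts.items() if result == "Block"]

    # Blocked for every recipient: reject the whole mail, no NDR needed.
    if len(blocked) == len(verdicts):
        return REJECT, []

    # Handing the mail back to postfix failed: ask the sender to retry.
    if not reinject_ok:
        return TEMPFAIL, []

    # Accepted for at least one recipient: 250, optionally bounce the rest.
    return ACCEPT, (blocked if send_ndr else [])


if __name__ == "__main__":
    print(smtp_response({"a@example.com": "Accept", "b@example.com": "Block"}))
```

With a single recipient the NDR list is always empty, so the sketch reduces to
the three single-recipient cases listed further up.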

Given that the smtpd_proxy_filter is called quite late in the postfix pipeline,
PMG still profits from the protections provided by postscreen and the pmgpolicy
service (greylisting, hard SPF evaluation).

Things still missing in the RFC:
* the bounces generated are not yet adapted to RFC 6533 (internationalized
  bounces when announcing SMTPUTF8 extension)

Preliminary Tests:
Replacing the postfix smtpd (and its queue) on the front-line by
pmg-smtp-filter will have some effect on the behavior of the system, since
pmg-smtp-filter is not the fastest component: it does quite a lot, mostly
running the spamassassin analysis. I therefore tried running 2 test-scenarios:
* use 'postal' [1] for benchmarking:
** setup: `timeout 2m postal -M 25 -m 500 -t 10 -c 50 -f senders <pmg-ip> recepients`
   (run 10 threads, each sending 50 mails before opening a new connection;
   the mails consist of random text, apart from a minimal set of headers, and
   are between 25k and 500k in size)
** the random data seems to be (probably not too much of a surprise) a rather
   bad scenario for SpamAssassin (many complicated regex-matches on the
   mail-text) - the processing time per mail was on average between 10s and
   120s (99.99 % of it due to spamassassin)
** the throughput (mails actually going out of pmg) is roughly the same between
   before and after queue filtering
** with after-queue filtering 'postal' was (of course) able to deliver far
   more mails to PMG (2.5k in 120 seconds) - they were queued by postfix and
   would have been eventually delivered, but the output of PMG was the same
   (around 30-45)

* queue up 3x500 mails (2 rather small testmails and 1 350k mail)
  in a postfix-queue (on a separate host while postfix is not running) and
  start postfix. The runtimes for analyzing these mails are far closer to what
  we see in production (<1s up to 3s for the large mail).
** with this test-set the queue in the original postfix got emptied within
   3 minutes (yielding about 8.3 mails/second on my test-installation)
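
For reference, the rate quoted above follows directly from the test size and
the time it took to drain the queue. A tiny Python sketch of the arithmetic
(the helper name is only for illustration):

```python
def throughput(mails: int, seconds: float) -> float:
    """Return the average number of mails processed per second."""
    return mails / seconds


if __name__ == "__main__":
    # 3 x 500 mails drained in about 3 minutes
    rate = throughput(3 * 500, 3 * 60)
    print(f"{rate:.1f} mails/second")  # prints "8.3 mails/second"
```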

While looking into the extremely long time spamassassin took for the random
data mails, I tried to precompile the spamassassin rules with sa-compile [2].
Result:
* on the random data quite a speedup was achieved (115 vs 45)
* on the 3x500 test the processing time did not change much (if noticeably at
  all).

I would be very grateful for feedback (especially suggestions for further
testing that would make sense).

[0] http://www.postfix.org/SMTPD_PROXY_README.html
[1] https://esmtp.email/blog/2017/11/04/postal-benchmark/
[2] https://cwiki.apache.org/confluence/display/spamassassin/FasterPerformance

pmg-api:
Stoiko Ivanov (4):
  add missing use MIME::Entity in PMG::Utils
  PMG::Config: refactor dns info collection
  add generate_ndr to PMG::SMTP
  add support for before queue filtering

 src/PMG/Config.pm          | 42 +++++++++++++++----
 src/PMG/SMTP.pm            | 85 ++++++++++++++++++++++++++++++++++----
 src/PMG/Utils.pm           |  1 +
 src/templates/master.cf.in | 14 +++++++
 4 files changed, 124 insertions(+), 18 deletions(-)

pmg-docs:
Stoiko Ivanov (1):
  add before_queue params to gen-pmg.conf.5.-opts.pl

 gen-pmg.conf.5-opts.pl | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.20.1