[pve-devel] [PATCH cluster/manager/ha-manager/proxmox{, -perl-rs} 00/18] fix #4156: introduce new notification module

Fri Apr 14 11:47:01 CEST 2023

First of all, thanks for the feedback!

On 4/14/23 08:19, Thomas Lamprecht wrote:
> 
> Hmmm, this is something that I did not think to closely about when drafting
> the (very rough) design for the notification overhaul, but it has some merit
> to be able to configure this per job – but naturally it also adds some
> redundancy with a global filtering system. IOW., we still need something
> not all too compley for handling a job for, say, not so important test virtual
> guests and one for production virtual guests. FWIW this could also be handled
> by having a fixed mapping for some overides and jobs, i.e., the global
> notification config could hold stanzas that match to a specific job (or other
> notification emitting type) which is then also exposed when editing said job
> directly. Otherwise, as we have now a pretty generic datacenter wide config
> for almost all relevant jobs Proxmox VE can handle, we could also add a property
> there that allows to match some notification policy/filter/config – that might
> be cleaner as it avoids having to read/handle two configs in one API call.

In an earlier on-paper draft of the whole notification overhaul I planned but then discarded
the concept of 'notification channels'. A notification channel would be ... well ... a channel
where jobs etc. can emit notifications to. In the notification configuration, one would then
be able to configure which endpoints are used for each channel. Jobs can configure which,
if at all, channel they emit notifications to. For some use cases, this would eliminate the
need for filtering (maybe except for severity) completely.

In this approach, the whole 'ephemeral sendmail endpoint' for VZDump jobs with configured email
notifications would/could be promoted to be an 'ephemeral channel', covering two use cases:
   a) graceful switch-over to the new notification backend, without breaking anything
   b) a 'short cut' for those users who don't really care about anything other than emails

The major difference from the user's perspective between 'ephemeral channel' and 'ephemeral endpoint'
would be that the channel sends *only* to the manually set address, without bothering about any other
endpoints. This would also solve the 'problem' where a notification might be sent twice when a job has a
the mailto property set *and* there is a sendmail endpoint with the same address.

To illustrate the concept, a short example. Let's assume we have some important production VMs.
The backup job for these could notify via the 'notify-prod' channel, which will ultimately fire out
notifications on several, high-priority endpoints (e.g. mail *and* some form of push notification).
Other, less important VMs could be backed up by a job that emits notifications to a different, low prio
channel (e.g. 'notify-non-prod'). For yet some other VMs, a specific person needs to be notified, so
there the admin chooses the 'direct email, without bells and whistles'-option, where they can
simply add the mail address in the VZDump job dialog.

I hope I could illustrate the idea, what do you think?

> 
>> The
>> drawback of this approach is that we might send a notification twice
>> in case a user has created another sendmail endpoint with the same
>> recipient. However, this is still better than not sending a
>> notification at all. However, I'm open for suggestions for other
>> approaches for maintaining compatibility.
>>
>> Alternative approaches that came to my mind are:
>>    - Explicitly break existing mail notifications with a major
>>      release, maybe create a default sendmail endpoint where *all*
>>      notifications are sent to root at pam via a sendmail endpoint.
>>      In this variant, the custom mail recipients for vzdump jobs would
>>      be removed
>>    - On update, create endpoints in the new configuration file for all
>>      possible email addresses, along with appropriate filters so that
>>      the behavior for every component currently sending mails remains
>>      unchanged. I don't really like this approach, as it might lead to
>>      a pretty messy 'notifications.cfg' with lots of endpoints/filters, if
>>      the user has lots of different backup jobs configured.
> 
> 
> If we go that way we could do the ephermal sendpoint only heuristically,
> i.e., check if there's another (email) notifaction endpoint that matches
> ours and the `root at localhost || $configured_root_at_pam_email` and silence
> it then, as that means that the admin switched over to the new mechanism.
> 
> Or, much simpler and transparent, some node/cluster global flag in e.g.,
> vzdump.conf.
> 
Good idea! However, as mentioned, this would no longer be necessary with the
approach described above.

>   
>> Side effects of the changes:
>>    - vzdump loses the HTML table which lists VMs/CTs. Since different
>>      endpoints support different markup styles for formatting, it
>>      does not really make sense to support this in the notification API, at
>>      least not without 'cross-rendering' (e.g. markdown --> HTML)
> 
> 
> This I'd think should be avoided if just anyhow possible, as that is
> quite the nice feature. For keeping that possible we could add a optional
> strucutred data object that can be passed to the notifaction system and
> that (some) plugins can then use to show some more structured data.
> 
> Simplest might be a Option<serde_json::Value> that has a defined structure
> like for example (psuedo json):
> 
> {
>    schema: {
>      type: "table", // or object
>      rows: ["vmid", "job", "..."],
>      // ... ?
>    },
>    data: [
>       { vmid: 100, job: "foo", "...": "..."},
>       ...
>    ],
> }
> 
> The mail plugin could then produce the two tables again, for the text/plain
> and text/html multiparts again.
> 

I've thought about this before, but ultimately discarded the idea since I was not sure
whether it was worth the effort. The HTML table was looking rather crude before, so I
did personally not see the benefit of it over a nice plain-text table.

I'll give this approach a try for v2. I'll have to think how this will play
together with the mentioned registered/templated notifications.

Having the data structured in the way you propose will also be beneficial once other
notification endpoints are implemented, e.g. webhook with JSON payload. There,
a plain text table does not make that much sense as a payload ;) .
Also, having HTML rendering in a single place will make it easy to add fancier styling later.

> nit, might use the verb form for this to be slightly shorter, i.e.:
> PVE::Notify::info

Good point, I'll rename it.
> 
>>
>>      In the future, the API might be changed/extended so that supports
>>      "registering" notifications. This allows us to a.) generate a
>>      list of all possible notification sources in the system b.) allows
>>      users to easily create filters for specific notification events.
>>      In my head, using the notification module could look like this
>>      then:
>>
>>      # Global context
>>      my  = PVE::Notification::register({
>>        'id' => 'backup-failed',
>>        'severity' => 'error',
>>        'properties' => ['host', 'vmlist', 'logs'],
>>        'title' => '{{ host }}: Backup failed'
>>        'body' => <<'EOF'
>>      A backup has failed for the following VMs: {{ vmlist }}
>>
>>      {{ logs }}
>>      EOF
>>      });
> 
> hmm, yeah having that in module space so that its called on load like our
> API endpoints is certainly the straight forward way, but having it act like
> it would be rust (i.e., the notifiy send call itself _is_ the registry) would
> be even nicer – but might be a tad too complex, as Wolfgang reject implementing
> a Perl parser in rust already :(
> 
I haven't really spent much thought yet on how the notification API would look in our
Rust-based products, but how could the `send` call fill the registry?
I assume this is only possible with some (procedural?) macro magic? Do we have similar
prior work somewhere?

>>
>>      # Later, to send the notification:
>>      PVE::Notification::send(->instantiate({
>>        'host' => 'earth',
>>        'vmlist' => ... ,
>>        'logs' => ... ,
>>      }));
>>
>>      Title and body effectively become templates that can be
>>      instantiated with arbitrary properties. This has the following
>>      benefits:
>>        - This ensures that notifications could be easily translated in
>>          the future
>>        - Allows us to add functionality that allows users to customize
>>          notification text, as wished in [2].
>>        - Properties can be used in filters (e.g. exact match, regex,
>>          etc.)
>>        - Properties can be listed in the list of notifications
>>          mentioned above, making it easier to create filters.
> 
> Yes that would be nice and they could also have access to the structured
> data, if passed as considered above.
> 

Agreed.

> all in all it doesn't sounds to bad, from a higer level the structured data
> to be able to keep special formats like html tables in backup notifiactions
> and the how to switch over, or better said, how to also have some control in
> the job-specific config about how/if notifications are emitted, are the two
> higher level issues we should IMO tackle before initial integration can be
> merged.

All right, thanks again for the feedback. I'll take your suggestions into account and
send a v2 once ready.

>>
>>
>> Summary over all repositories:
>>    34 files changed, 1464 insertions(+), 140 deletions(-)
>>
>> Generated by murpp v0.1.0
> 
> huh, what's this? some patch handling tool I don't know about?
> 
> 

It's something that I've been building on the side. I was kind of annoyed by
manually assembling cover letters, especially when the patch series spanned multiple
repos and when there were multiple iterations of the same patch series.
It's a small Rust (tm) tool that pretty much fully automates preparing patches for series
like these. You can write the text for the cover letter, title, series version, and included
repos/branches in a single configuration file (YAML for now, as this allows me to include the
text for the body in a multi-line string, but I'm looking for alternatives).
The tool will then generate all patches with consecutive patch numbering and generate the final
cover letter, including all diff-stats and a final summary.

This series is the first time I've used the tool and I'm pretty happy so far.
Apart from writing the cover letter, I did not have to put in any manual work to assemble this series;
a single `murpp generate` was enough to have this patch series ready for `git send-email`.

I've created a staff repo for it, in case anybody wants to try it out.

-- 
- Lukas