[pve-devel] [PATCH manager] fix #4111: replication: don't send mail when fail count is zero

Fabian Ebner f.ebner at proxmox.com
Tue Jun 14 11:47:32 CEST 2022

which can happen when failing to obtain the guest's migration lock.
This led to a lot of mails being sent during migration (timeout for
obtaining lock is only 2 seconds and we run it in a loop).

One could argue that failing to obtain the lock should increase the
fail count, but without the lock, the job state should not be touched
and even the first three mails upon migration could be considered spam.

Fixes: e6b8af20 ("replication: sent always mail for first three tries and move helper")
Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
 PVE/API2/Replication.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/PVE/API2/Replication.pm b/PVE/API2/Replication.pm
index 522aa3bf..af77d2f4 100644
--- a/PVE/API2/Replication.pm
+++ b/PVE/API2/Replication.pm
@@ -77,6 +77,10 @@ sub run_single_job {
 my sub _should_mail_at_failcount {
     my ($fail_count) = @_;
+    # avoid spam during migration (bug #4111): when failing to obtain the guest's migration lock,
+    # fail_count will be 0
+    return 0 if $fail_count == 0;
     return 1 if $fail_count <= 3; # always send the first few for better visibility of the issue
     # failing job is re-tried every half hour, try to send one mail after 1, 2, 4, 8, etc. days
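
For illustration only, the mail decision after this change can be sketched
as a stand-alone script. This is not the code from PVE/API2/Replication.pm:
the sub name should_mail_at_failcount_sketch is hypothetical, and the
doubling schedule (48 half-hourly retries per day, so mails at 48, 96,
192, ... failures) is an assumption derived from the comment in the hunk,
since the rest of the helper is not shown in this patch. Only the first
two checks are taken from the diff.

```perl
use strict;
use warnings;

# Hypothetical stand-alone sketch of the helper. Only the first two
# checks come from the patch; the rest is an assumption.
sub should_mail_at_failcount_sketch {
    my ($fail_count) = @_;

    # From the patch: fail_count stays 0 when the guest's migration lock
    # could not be obtained, so no mail is sent.
    return 0 if $fail_count == 0;

    # From the patch: always mail for the first three failures.
    return 1 if $fail_count <= 3;

    # Assumption: a failing job is retried every half hour, i.e. 48 times
    # per day, so mail again at 48, 96, 192, ... failures (1, 2, 4, ... days).
    my $threshold = 48;
    $threshold *= 2 while $threshold < $fail_count;

    return $threshold == $fail_count ? 1 : 0;
}

# Print the decision for a few sample fail counts.
for my $count (0, 1, 3, 4, 48, 49, 96) {
    printf "%3d -> %d\n", $count, should_mail_at_failcount_sketch($count);
}
```

With the early return in place, a fail count of zero never reaches the
"first few failures" check, which is what previously caused a mail on
every failed lock attempt during migration.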
