[pve-devel] [PATCH common v2 1/3] daemon: don't send SIGTERM before restart on leave_children_open_on_reload

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Nov 10 12:09:27 CET 2017


Else this options is not really useful. First, sending a SIGTERM lets
the children exit, not quite what "leave_children_open_on_reload"
promises.

The problem this causes is that we may get a time window where no
worker is active and thus, for example, our API daemon would not
accept connections during a restart (or better said, reload).

So, don't request termination of any child worker, if this option is
set, but rather just restart (re-exec) ourself, startup a new set of
workers and only then request the termination of the old ones,
allowing a fully seamless reload.

This is only done on `$daemon-exe restart` and thus on
`systemctl reload $daemon`, systemctl restart or any other stop start
cycles always exit all other workers first.

This expects that the worker can do a graceful termination on
SIGTERM, which is already the case for anything using our AnyEvent
based class (which is base of our HTTPServer module).
With graceful termination is meant the following: the worker accepts
no new work and exits immediately after the current queued work is
done.

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---

changes v1 -> v2:
* optimize out intermediate arrays as we can pass a list to kill (if list is
  empty nothing relevant happens here)

 src/PVE/Daemon.pm | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/src/PVE/Daemon.pm b/src/PVE/Daemon.pm
index 9d72c32..d438d27 100644
--- a/src/PVE/Daemon.pm
+++ b/src/PVE/Daemon.pm
@@ -184,6 +184,13 @@ my $start_workers = sub {
     }
 };
 
+my $terminate_old_workers = sub {
+    my ($self) = @_;
+
+    # if list is empty kill sends no signal, so no checks needed
+    kill 15, keys %{$self->{old_workers}};
+};
+
 my $terminate_server = sub {
     my ($self, $allow_open_children) = @_;
 
@@ -198,20 +205,12 @@ my $terminate_server = sub {
     eval { $self->shutdown(); };
     warn $@ if $@;
 
-    # we have workers - send TERM signal
-
-    foreach my $cpid (keys %{$self->{workers}}) {
-	kill(15, $cpid); # TERM childs
-    }
 
     # if configured, leave children running on HUP
-    return if $allow_open_children &&
-	$self->{leave_children_open_on_reload};
+    return if $allow_open_children && $self->{leave_children_open_on_reload};
 
-    # else, send TERM to old workers
-    foreach my $cpid (keys %{$self->{old_workers}}) {
-	kill(15, $cpid); # TERM childs
-    }
+    # else send TERM to all (old and current) child workers
+    kill 15, keys %{$self->@{'workers','old_workers'}};
 
     # nicely shutdown childs (give them max 10 seconds to shut down)
     my $previous_alarm = alarm(10);
@@ -395,13 +394,11 @@ my $server_run = sub {
 		&$old_sig_chld(@_) if $old_sig_chld;
 	    };
 
-	    # catch worker finished during restart phase 
-	    &$finish_workers($self);
-
 	    # now loop forever (until we receive terminate signal)
 	    for (;;) { 
 		&$start_workers($self);
 		sleep(5);
+		&$terminate_old_workers($self);
 		&$finish_workers($self);
 		last if $self->{terminate};
 	    }
-- 
2.11.0





More information about the pve-devel mailing list