[pve-devel] [PATCH common 1/3] daemon: don't send SIGTERM before restart on leave_children_open_on_reload
Thomas Lamprecht
t.lamprecht at proxmox.com
Thu Nov 9 14:56:27 CET 2017
Otherwise this option is not really useful: sending a SIGTERM makes
the children exit, which is not what "leave_children_open_on_reload"
promises.
The problem this causes is a possible time window in which no worker
is active, so that, for example, our API daemon would not accept
connections during a restart (or rather, a reload).
So, if this option is set, don't request termination of any child
worker. Instead, just restart (re-exec) ourselves, start up a new set
of workers and only then request the termination of the old ones,
allowing a fully seamless reload.
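The ordering described above can be illustrated with a minimal sketch.
It is not the PVE::Daemon implementation: it skips the re-exec and the
SIGCHLD handler and keeps everything in one parent process, and the
helper names spawn_workers and reload_workers are hypothetical.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork $count workers that idle until they get SIGTERM.
sub spawn_workers {
    my ($count) = @_;

    my $workers = {};
    for (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!\n" if !defined($pid);
        if ($pid == 0) {
            # child: exit cleanly once termination is requested
            $SIG{TERM} = sub { POSIX::_exit(0) };
            sleep(1) while 1;
        }
        $workers->{$pid} = 1;
    }
    return $workers;
}

# Hypothetical helper: the replacement set is running before the old
# set is asked to exit, so there is no window without workers.
sub reload_workers {
    my ($old_workers, $count) = @_;

    my $new_workers = spawn_workers($count);  # 1. start new workers
    kill 'TERM', keys %$old_workers;          # 2. only then TERM the old ones
    waitpid($_, 0) for keys %$old_workers;    # 3. reap them (sketch only)

    return $new_workers;
}
```

Calling reload_workers($old, 2) on a set returned by spawn_workers(2)
returns the new set. The real daemon reaps children asynchronously
instead of blocking in waitpid.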
This is only done on `$daemon-exe restart`, and thus on
`systemctl reload $daemon`. A `systemctl restart` or any other
stop/start cycle still makes all workers exit first.
This expects that the worker can do a graceful termination on
SIGTERM, which is already the case for anything using our AnyEvent
based class (which is the base of our HTTPServer module).
Graceful termination here means the following: the worker accepts
no new work and exits as soon as the currently queued work is
done.
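A minimal sketch of such a graceful worker loop follows. It is not
based on AnyEvent or the HTTPServer module. The function run_worker
and its $queue and $accept parameters are hypothetical and only
illustrate the "no new work, drain the queue, then exit" behaviour.

```perl
use strict;
use warnings;

# Hypothetical worker loop.
#   $queue  - array ref of code refs (already queued work)
#   $accept - callback returning a new job (code ref) or undef
# Returns the number of jobs that were processed.
sub run_worker {
    my ($queue, $accept) = @_;

    my $terminate = 0;
    # only set a flag, so that a running job is never interrupted
    local $SIG{TERM} = sub { $terminate = 1 };

    my $done = 0;
    while (1) {
        # accept new work only while no termination was requested
        if (!$terminate) {
            my $job = $accept->();
            push @$queue, $job if defined($job);
        }

        if (my $job = shift @$queue) {
            $job->();
            $done++;
            next;
        }

        last if $terminate;                 # queue drained, exit now
        select(undef, undef, undef, 0.05);  # idle briefly, then poll again
    }
    return $done;
}
```

Once SIGTERM arrives, the $accept callback is no longer called, the
remaining queue entries still run, and the loop returns as soon as
the queue is empty.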
Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
src/PVE/Daemon.pm | 31 ++++++++++++++++++-------------
1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/src/PVE/Daemon.pm b/src/PVE/Daemon.pm
index 9d72c32..a6b58d1 100644
--- a/src/PVE/Daemon.pm
+++ b/src/PVE/Daemon.pm
@@ -184,6 +184,19 @@ my $start_workers = sub {
}
};
+my $terminate_old_workers = sub {
+ my ($self) = @_;
+
+ my $cpids = [ keys %{$self->{old_workers}} ];
+
+ return if !($cpids && scalar(@$cpids) > 0);
+
+ # request graceful exit, no need for waitpid we have a SIGCHLD handler
+ foreach my $cpid (@$cpids) {
+ kill 15 => $cpid;
+ }
+};
+
my $terminate_server = sub {
my ($self, $allow_open_children) = @_;
@@ -198,19 +211,13 @@ my $terminate_server = sub {
eval { $self->shutdown(); };
warn $@ if $@;
- # we have workers - send TERM signal
-
- foreach my $cpid (keys %{$self->{workers}}) {
- kill(15, $cpid); # TERM childs
- }
# if configured, leave children running on HUP
- return if $allow_open_children &&
- $self->{leave_children_open_on_reload};
+ return if $allow_open_children && $self->{leave_children_open_on_reload};
- # else, send TERM to old workers
- foreach my $cpid (keys %{$self->{old_workers}}) {
- kill(15, $cpid); # TERM childs
+ # else send TERM to all (old and current) child workers
+ foreach my $cpid (map { keys %$_ } $self->@{'workers','old_workers'}) {
+ kill(15, $cpid);
}
# nicely shutdown childs (give them max 10 seconds to shut down)
@@ -395,13 +402,11 @@ my $server_run = sub {
&$old_sig_chld(@_) if $old_sig_chld;
};
- # catch worker finished during restart phase
- &$finish_workers($self);
-
# now loop forever (until we receive terminate signal)
for (;;) {
&$start_workers($self);
sleep(5);
+ &$terminate_old_workers($self);
&$finish_workers($self);
last if $self->{terminate};
}
--
2.11.0