[pve-devel] pve api offline during log rotation

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Wed Sep 27 08:21:09 CEST 2017


Hi,

On 21.09.2017 at 15:30, Thomas Lamprecht wrote:
> On 09/20/2017 01:26 PM, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>>
>> On 20.09.2017 at 10:36, Thomas Lamprecht wrote:
>>> On 09/20/2017 06:40 AM, Stefan Priebe - Profihost AG wrote:
>>>> Nobody?
>>>>
>>>
>>> We register the restart command from pveproxy with the $use_hup
>>> parameter. This sends a SIGHUP when calling pveproxy restart, which
>>> gets mapped to systemctl reload-or-restart pveproxy.service, which
>>> sends the HUP signal to the main process, which in turn does an exec
>>> on itself. So a restart is a reload in this case.
>>>
>>> As we already use our Daemons class' "leave_children_open_on_reload"
>>> functionality to keep the workers running during a restart, open
>>> connections should stay. But as the workers always get a SIGTERM even
>>> with this option, this is not always the case. I will investigate this
>>> behavior, but it is not responsible for longer restart times.
>>>
>>> I suspect that you suffered from a bug we fixed about a week ago [1]
>>> where worker signal handlers got overwritten and thus daemons could
>>> not always be stopped gracefully. Restarting may have been affected by
>>> it too.
>>>
>>> The fix is currently already in the no-subscription repo.
>>>
>>> If you still experience this behavior could you please do a
>>> `pveproxy restart` and post the logs from during the restart,
>>> something like:
>>> # journalctl -u pveproxy.service --since -10min
>>
>>
>> Thanks for the reply. Does this also apply to PVE 4? Sorry, I missed that
>> info.
>>
> 
> Yes. But it should happen more often with PVE 5.
> The affected packages already have the fix backported and applied in git.
> It may take a little time until they all get released to all Debian repos,
> though.

Sorry for the late reply. I updated all packages to the latest git stable-4
branch.

But it still happens. At 06:25 I get a lot of:
595 Connection refused
or
500 Can't connect to XXXXXX.de:8006

messages.

The journal looks like this:
Sep 27 06:25:03 systemd[1]: Stopping PVE API Proxy Server...
Sep 27 06:25:04 pveproxy[47485]: received signal TERM
Sep 27 06:25:04 pveproxy[47485]: server closing
Sep 27 06:25:04 pveproxy[25445]: worker exit
Sep 27 06:25:04 pveproxy[24339]: worker exit
Sep 27 06:25:04 pveproxy[16705]: worker exit
Sep 27 06:25:04 pveproxy[52966]: worker exit
Sep 27 06:25:04 pveproxy[16550]: worker exit
Sep 27 06:25:04 pveproxy[18493]: worker exit
Sep 27 06:25:04 pveproxy[3140]: worker exit
Sep 27 06:25:04 pveproxy[12704]: worker exit
Sep 27 06:25:04 pveproxy[13839]: worker exit
Sep 27 06:25:04 pveproxy[47485]: worker 16705 finished
Sep 27 06:25:04 pveproxy[47485]: worker 13839 finished
Sep 27 06:25:04 pveproxy[47485]: worker 12704 finished
Sep 27 06:25:04 pveproxy[47485]: worker 17652 finished
Sep 27 06:25:04 pveproxy[47485]: worker 3140 finished
Sep 27 06:25:04 pveproxy[47485]: worker 52966 finished
Sep 27 06:25:04 pveproxy[47485]: worker 24339 finished
Sep 27 06:25:04 pveproxy[47485]: worker 25445 finished
Sep 27 06:25:04 pveproxy[47485]: worker 18493 finished
Sep 27 06:25:04 pveproxy[47485]: worker 16550 finished
Sep 27 06:25:04 pveproxy[47485]: server stopped
Sep 27 06:25:05 systemd[1]: Starting PVE API Proxy Server...
Sep 27 06:25:06 pveproxy[28612]: Using '/etc/pve/local/pveproxy-ssl.pem' as certificate for the web interface.
Sep 27 06:25:06 pveproxy[28617]: starting server
Sep 27 06:25:06 pveproxy[28617]: starting 10 worker(s)
Sep 27 06:25:06 pveproxy[28617]: worker 28618 started
Sep 27 06:25:06 pveproxy[28617]: worker 28619 started
Sep 27 06:25:06 pveproxy[28617]: worker 28620 started
Sep 27 06:25:06 pveproxy[28617]: worker 28621 started
Sep 27 06:25:06 pveproxy[28617]: worker 28622 started
Sep 27 06:25:06 pveproxy[28617]: worker 28623 started
Sep 27 06:25:06 pveproxy[28617]: worker 28624 started
Sep 27 06:25:06 pveproxy[28617]: worker 28625 started
Sep 27 06:25:06 pveproxy[28617]: worker 28626 started
Sep 27 06:25:06 systemd[1]: Started PVE API Proxy Server.
Sep 27 06:25:06 pveproxy[28617]: worker 28628 started

The problem remains that pveproxy is stopped and started again, so
there is a gap during which no new connections are accepted.

The normal behaviour for other daemons I know is to use a special signal to
just reopen the logs, without stopping and starting the daemon.
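
To illustrate the pattern meant here, a minimal Python sketch of a process that reopens its log file on a signal instead of restarting. This is not pveproxy's code; the signal (SIGUSR1), the class name ReopenableLog and the file names are assumptions made up for the example, and it only runs on Unix-like systems.

```python
import os
import signal
import tempfile


class ReopenableLog:
    """Append-only log file that can be reopened while the process keeps running."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a")
        self._reopen_requested = False

    def request_reopen(self, signum=None, frame=None):
        # Signal handler: only set a flag, the file is reopened on the next write.
        self._reopen_requested = True

    def write(self, line):
        if self._reopen_requested:
            # The old file may have been renamed by log rotation; open the path anew.
            self._fh.close()
            self._fh = open(self.path, "a")
            self._reopen_requested = False
        self._fh.write(line + "\n")
        self._fh.flush()

    def close(self):
        self._fh.close()


def demo():
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "access.log")
    log = ReopenableLog(path)
    # SIGUSR1 is an assumed choice; the daemon itself is never stopped.
    signal.signal(signal.SIGUSR1, log.request_reopen)

    log.write("before rotation")
    os.rename(path, path + ".1")          # what a log rotation tool does
    os.kill(os.getpid(), signal.SIGUSR1)  # what a post-rotation hook would send
    log.write("after rotation")
    log.close()

    with open(path + ".1") as old, open(path) as new:
        return old.read(), new.read()
```

Calling demo() returns the contents of the rotated file and of the freshly opened file. The process keeps running the whole time, so a listening socket held by the same process would never be closed.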

Greets,
Stefan


