[pve-devel] pve api offline during log rotation

Stefan Priebe - Profihost AG s.priebe at profihost.ag
Wed Nov 8 07:41:07 CET 2017


Hello,

any news on this? Is this expected?

Thanks,
Stefan

On 27.09.2017 at 08:21, Stefan Priebe - Profihost AG wrote:
> Hi,
> 
> On 21.09.2017 at 15:30, Thomas Lamprecht wrote:
>> On 09/20/2017 01:26 PM, Stefan Priebe - Profihost AG wrote:
>>> Hi,
>>>
>>>
>>> On 20.09.2017 at 10:36, Thomas Lamprecht wrote:
>>>> On 09/20/2017 06:40 AM, Stefan Priebe - Profihost AG wrote:
>>>>> Nobody?
>>>>>
>>>>
>>>> We register the restart command from pveproxy with the $use_hup parameter.
>>>> This then sends a SIGHUP when calling pveproxy restart, which gets mapped
>>>> to systemctl reload-or-restart pveproxy.service. That sends the HUP signal
>>>> to the main process, which in turn does an exec on itself. So a restart is
>>>> a reload in this case.
>>>>
>>>> As we already use our Daemons class' "leave_children_open_on_reload"
>>>> functionality to keep the workers running during a restart, open
>>>> connections should stay. But as the workers always get a SIGTERM, even with
>>>> this option, this is not always the case. I will investigate this behavior,
>>>> but it is not responsible for longer restart times.
>>>>
>>>> I suspect that you suffered from a bug we fixed about a week ago [1],
>>>> where worker signal handlers got overwritten and thus daemons could not
>>>> always be stopped gracefully. Restarting may have been affected by it too.
>>>>
>>>> The fix is currently already in the no-subscription repo.
>>>>
>>>> If you still experience this behavior could you please do a
>>>> `pveproxy restart` and post the logs from during the restart,
>>>> something like:
>>>> # journalctl -u pveproxy.service --since -10min
>>>
>>>
>>> Thanks for the reply. Does this also apply to PVE 4? Sorry, I missed that
>>> info.
>>>
>>
>> Yes. But it should happen more often with PVE 5.
>> Affected packages already have the fix backported and applied in git.
>> It may need a little time until all get released to all Debian repos,
>> though.
> 
> Sorry for the late reply. I updated all packages to the latest git stable-4
> branch.
> 
> But it still happens. At 06:25 I get a lot of:
> 595 Connection refused
> or
> 500 Can't connect to XXXXXX.de:8006
> 
> messages.
> 
> The journal looks like this:
> Sep 27 06:25:03 systemd[1]: Stopping PVE API Proxy Server...
> Sep 27 06:25:04 pveproxy[47485]: received signal TERM
> Sep 27 06:25:04 pveproxy[47485]: server closing
> Sep 27 06:25:04 pveproxy[25445]: worker exit
> Sep 27 06:25:04 pveproxy[24339]: worker exit
> Sep 27 06:25:04 pveproxy[16705]: worker exit
> Sep 27 06:25:04 pveproxy[52966]: worker exit
> Sep 27 06:25:04 pveproxy[16550]: worker exit
> Sep 27 06:25:04 pveproxy[18493]: worker exit
> Sep 27 06:25:04 pveproxy[3140]: worker exit
> Sep 27 06:25:04 pveproxy[12704]: worker exit
> Sep 27 06:25:04 pveproxy[13839]: worker exit
> Sep 27 06:25:04 pveproxy[47485]: worker 16705 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 13839 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 12704 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 17652 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 3140 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 52966 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 24339 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 25445 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 18493 finished
> Sep 27 06:25:04 pveproxy[47485]: worker 16550 finished
> Sep 27 06:25:04 pveproxy[47485]: server stopped
> Sep 27 06:25:05 systemd[1]: Starting PVE API Proxy Server...
> Sep 27 06:25:06 pveproxy[28612]: Using '/etc/pve/local/pveproxy-ssl.pem' as certificate for the web interface.
> Sep 27 06:25:06 pveproxy[28617]: starting server
> Sep 27 06:25:06 pveproxy[28617]: starting 10 worker(s)
> Sep 27 06:25:06 pveproxy[28617]: worker 28618 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28619 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28620 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28621 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28622 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28623 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28624 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28625 started
> Sep 27 06:25:06 pveproxy[28617]: worker 28626 started
> Sep 27 06:25:06 systemd[1]: Started PVE API Proxy Server.
> Sep 27 06:25:06 pveproxy[28617]: worker 28628 started
> 
> The problem is still that pveproxy is stopped and started again, so
> there is a gap where no new connections get accepted.
> 
> The normal behaviour for other daemons I know is to use a special signal to
> just reopen the logs, without stopping and starting the daemon.
> 
> Greets,
> Stefan
> 
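
To illustrate the "reopen the logs on a signal" pattern mentioned in the quoted
mail, here is a minimal sketch in Python. It is not how pveproxy is implemented;
the choice of SIGUSR1, the Log class, the file name access.log and the demo()
helper are all assumptions made up for the example. The point it tries to show
is that the listening socket is never closed while the log file is swapped, so
there is no window with "Connection refused". It assumes a POSIX system.

```python
import os
import signal
import socket
import tempfile
import time

reopen_requested = False  # set by the signal handler, consumed by the main loop


def request_reopen(signum, frame):
    """Signal handler: only record the request, do the work in the main loop."""
    global reopen_requested
    reopen_requested = True


class Log:
    """Append-only log file that can be reopened after rotation (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.fh = open(path, "a")

    def write(self, line):
        self.fh.write(line + "\n")
        self.fh.flush()

    def reopen(self):
        # Close the handle that still points at the rotated file and
        # open a fresh file under the original name.
        self.fh.close()
        self.fh = open(self.path, "a")


def demo():
    """Rotate the log under a live listener and return (old, new) contents."""
    global reopen_requested
    reopen_requested = False
    signal.signal(signal.SIGUSR1, request_reopen)  # SIGUSR1 is an assumption

    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "access.log")
    log = Log(path)

    # The listening socket stays open for the whole run.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)

    log.write("before rotation")

    # What a log rotation tool would do: move the file, then signal the daemon.
    os.rename(path, path + ".1")
    os.kill(os.getpid(), signal.SIGUSR1)

    # Main-loop part: wait briefly for the flag, then reopen the log.
    deadline = time.monotonic() + 2.0
    while not reopen_requested and time.monotonic() < deadline:
        time.sleep(0.01)
    if reopen_requested:
        reopen_requested = False
        log.reopen()

    log.write("after rotation")

    # A new client can still connect because the listener was never closed.
    client = socket.create_connection(listener.getsockname(), timeout=2.0)
    conn, _ = listener.accept()
    conn.close()
    client.close()
    listener.close()
    log.fh.close()

    with open(path + ".1") as old, open(path) as new:
        return old.read(), new.read()
```

Calling demo() returns the contents of the rotated file and of the fresh file.
The handler only sets a flag and the actual reopen happens in the main loop,
which keeps the work done inside the signal handler as small as possible.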


