[pve-devel] pve api offline during log rotation

Mon Dec 11 06:58:36 CET 2017

Sorry for the late reply. Is there a chance to get the fix backported to 4.4?

Greets,
Stefan

Excuse my typo sent from my mobile phone.

> On 09.11.2017 at 13:40, Stefan Priebe - Profihost AG <s.priebe at profihost.ag> wrote:
> 
> *arg* sorry about that and thanks for resending your last paragraph. Yes
> that's exactly the point.
> 
> Also thanks for the restart and systemctl explanation.
> 
> Greets,
> Stefan
> 
>> On 09.11.2017 at 13:35, Thomas Lamprecht wrote:
>> Hi,
>> 
>>> On 11/09/2017 01:08 PM, Stefan Priebe - Profihost AG wrote:
>>> Yes, that's what I'm talking about. The logfile rotation script DOES a
>>> restart, not a reload.
>>> 
>> 
>> No, it doesn't do a systemd restart. It's a bit confusing, I know.
>> 
>>> See here:
>>> https://git.proxmox.com/?p=pve-manager.git;a=blob_plain;f=debian/pve.logrotate;hb=HEAD
>>> 
>> 
>> It does a `pveproxy restart`, which _is_ a real graceful "fast" restart,
>> not to be confused with `systemctl restart pveproxy` which does a full
>> stop first, and then a full new startup again.
>> 
>> `systemctl reload pveproxy` does exactly the same as the log rotation, see:
>> 
>> https://git.proxmox.com/?p=pve-manager.git;a=blob_plain;f=bin/init.d/pveproxy.service
>> 
>> Thus the problem is not this but the one I described in the last paragraph
>> from my last answer:
>> 
>>> On 11/09/2017 11:03 AM, Thomas Lamprecht wrote:
>>> 
>>> We do a re-exec on "ourself" (from the daemon's POV), and the intent is to
>>> leave non-idle child workers untouched, but the logic doing this is a bit
>>> flawed, as all current worker children always receive a TERM signal.
>>> Here the HTTP server workers at least wait for active connections to end,
>>> but new ones do not get accepted. We restart directly after that, but yes,
>>> depending on load there can be a time window where no one is there to
>>> accept connections.
>>> I'd rather not send the TERM signal in the case where the
>>> "leave_children_open_on_reload" option is set and we're restarting, but
>>> just restart, passing the current worker PIDs over to our new self
>>> (this already gets done). The new instance then starts new workers on
>>> startup and messages the old ones to stop accepting new connections and
>>> terminate gracefully as soon as possible. That way there is never a time
>>> where no active listening worker is there. I'll try to take a look at it.
>>> 
>> 
>> cheers,
>> Thomas
>>
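
The ordering problem described in the quoted paragraph can be sketched with a toy model. This is not the pveproxy code (which is Perl); the names `Worker`, `Pool`, `restart_term_first` and `restart_spawn_first` are hypothetical and only illustrate, under simplifying assumptions, why starting the new workers before telling the old ones to stop accepting closes the window in which nobody listens.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    pid: int
    accepting: bool = True  # still taking new connections

    def stop_accepting(self):
        # Models a graceful shutdown: active connections may finish,
        # but no new ones are accepted by this worker.
        self.accepting = False


@dataclass
class Pool:
    workers: list = field(default_factory=list)
    history: list = field(default_factory=list)  # accepting workers after each step
    next_pid: int = 100

    def record(self):
        self.history.append(sum(w.accepting for w in self.workers))

    def spawn(self, count):
        for _ in range(count):
            self.workers.append(Worker(self.next_pid))
            self.next_pid += 1
        self.record()


def restart_term_first(pool, count):
    # Behaviour described as flawed: all current workers are told to stop
    # first, and only afterwards are new workers started.
    for w in pool.workers:
        w.stop_accepting()
    pool.record()
    pool.spawn(count)


def restart_spawn_first(pool, count):
    # Proposed ordering: start new workers, then message the old ones to
    # stop accepting and terminate gracefully.
    old = list(pool.workers)
    pool.spawn(count)
    for w in old:
        w.stop_accepting()
    pool.record()


if __name__ == "__main__":
    for restart in (restart_term_first, restart_spawn_first):
        pool = Pool()
        pool.spawn(3)
        restart(pool, 3)
        print(restart.__name__, pool.history)
```

In this model the first ordering passes through a step with zero accepting workers, while the second never drops below the original worker count. A real daemon would additionally have to hand over the listening socket and the old worker PIDs across the re-exec, which the sketch leaves out.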