[pve-devel] [PATCH qemu-server v2 1/3] qmeventd: rework 'forced_cleanup' handling and set timeout to 60s
Wolfgang Bumiller
w.bumiller at proxmox.com
Fri Sep 23 10:16:06 CEST 2022
On Thu, Sep 22, 2022 at 04:19:33PM +0200, Dominik Csapak wrote:
> currently, the 'forced_cleanup' (sending SIGKILL to the qemu process),
> is intended to be triggered 5 seconds after sending the initial shutdown
> signal (SIGTERM) which is sadly not enough for some setups.
>
> Accidentally, it could be triggered earlier than 5 seconds, if a
> SIGALRM triggers in the timespan directly before setting it again.
>
> Also, this approach means that depending on when machines are shut down
> their forced cleanup may happen after 5 seconds, or any time after, if
> new vms are shut off in the meantime.
>
> Improve this situation by reworking the way we deal with this cleanup.
> We save the time incl. timeout in the CleanupData, and set a timeout
> to 'epoll_wait' of 10 seconds, which will then trigger a forced_cleanup.
> Remove entries from the forced_cleanup list when that entry is killed,
> or when the normal cleanup took place.
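If I read the description right, the bookkeeping boils down to something like the sketch below. This is untested and only meant to illustrate the idea: the flat array (instead of whatever list type the daemon really uses), the helper name `take_expired` and the struct layout are assumptions of mine, not taken from the patch.

```c
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Assumed layout: only `timeout` is known from the patch description. */
struct CleanupData {
    pid_t pid;
    int pidfd;
    time_t timeout; /* absolute deadline: shutdown time + grace period */
};

/*
 * Hypothetical helper: moves every entry whose deadline has been reached
 * into `expired` and compacts `entries` in place. `*count` is updated to
 * the number of entries still waiting; the number of expired entries is
 * returned. The caller would then SIGKILL the expired ones.
 *
 * Intended call site, roughly:
 *     epoll_wait(epoll_fd, events, max_events, 10 * 1000);
 *     ... handle events ...
 *     n = take_expired(entries, &count, time(NULL), expired);
 */
static size_t
take_expired(struct CleanupData *entries, size_t *count, time_t now,
    struct CleanupData *expired)
{
    size_t kept = 0;
    size_t done = 0;

    for (size_t i = 0; i < *count; i++) {
        if (entries[i].timeout > now) {
            entries[kept++] = entries[i]; /* not due yet, keep waiting */
        } else {
            expired[done++] = entries[i]; /* due, hand to the caller */
        }
    }

    *count = kept;
    return done;
}
```

Since each entry carries its own absolute deadline, a later shutdown no longer pushes back the forced cleanup of an earlier one; the 10 second `epoll_wait` timeout only bounds how late a kill can be.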
>
> To improve the shutdown behaviour, increase the default timeout to 60
> seconds, which should be enough, but add a commandline toggle where
> users can set it to a different value.
>
> Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
> ---
> qmeventd/qmeventd.c | 73 +++++++++++++++++++++++----------------------
> qmeventd/qmeventd.h | 2 ++
> 2 files changed, 39 insertions(+), 36 deletions(-)
>
> diff --git a/qmeventd/qmeventd.c b/qmeventd/qmeventd.c
> index 8d32827..46bc7eb 100644
> --- a/qmeventd/qmeventd.c
> +++ b/qmeventd/qmeventd.c
> @@ -551,27 +558,16 @@ handle_client(struct Client *client)
> json_tokener_free(tok);
> }
>
> -
> -/*
> - * SIGALRM and cleanup handling
> - *
> - * terminate_client will set an alarm for 5 seconds and add its client's PID to
> - * the forced_cleanups list - when the timer expires, we iterate the list and
> - * attempt to issue SIGKILL to all processes which haven't yet stopped.
> - */
> -
> -static void
> -alarm_handler(__attribute__((unused)) int signum)
> -{
> - alarm_triggered = 1;
> -}
> -
> static void
> -sigkill(void *ptr, __attribute__((unused)) void *unused)
If you change the style here... (which I'm not a fan of btw.)
> +sigkill(struct CleanupData *ptr, time_t *cur_time)
> {
> - struct CleanupData data = *((struct CleanupData *)ptr);
> + struct CleanupData data = *ptr;
...at least get rid of this line completely ^
and just use `ptr->` instead of `data.`, I see no reason to keep copying
the data onto the stack?
(or with the old style, make `data` a pointer and skip the cast)
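Roughly what I have in mind, as an untested sketch. The struct layout is only assumed from the fields the patch touches, the raw `syscall()` stands in for whatever `pidfd_send_signal` wrapper the file already has, and the error message is a placeholder:

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Assumed layout, reduced to the fields used below. */
struct CleanupData {
    pid_t pid;
    int pidfd;
    time_t timeout;
};

static void
sigkill(struct CleanupData *ptr, time_t *cur_time)
{
    int err;

    /* deadline not reached yet, leave the entry alone */
    if (ptr->timeout > *cur_time) {
        return;
    }

    if (ptr->pidfd > 0) {
        /* stand-in for the existing pidfd_send_signal wrapper */
        err = (int)syscall(SYS_pidfd_send_signal, ptr->pidfd, SIGKILL, NULL, 0);
        (void)close(ptr->pidfd);
    } else {
        err = kill(ptr->pid, SIGKILL);
    }

    /* placeholder error reporting; ESRCH just means it is already gone */
    if (err < 0 && errno != ESRCH) {
        fprintf(stderr, "SIGKILL of pid %d failed - %s\n",
            (int)ptr->pid, strerror(errno));
    }
}
```

No copy, no cast, and the body reads the same as before apart from `ptr->`.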
> int err;
>
> + if (data.timeout > *cur_time) {
> + return;
> + }
> +
> if (data.pidfd > 0) {
> err = pidfd_send_signal(data.pidfd, SIGKILL, NULL, 0);
> (void)close(data.pidfd);