[pbs-devel] [PATCH proxmox v2 0/4] worker task setup improvements

Fabian Grünbichler f.gruenbichler at proxmox.com
Mon Dec 2 14:04:08 CET 2024


This series fixes two issues related to rest-server shutdown/reload and
worker task accounting.

issue 1 (patch 1, RFC patches 3 and 4):

if WorkerTask::new returned an error because updating the task indices
failed (for example, because the lock covering such updates could not
be acquired before the timeout), the task was registered in the worker
task list, but not returned to the caller. this means the task could
never actually execute and reach its cleanup/log_result stage, which
would unregister it again. effectively, the worker task is "leaked" in
such a scenario: the task count decrement never happens, which in turn
means the corresponding proxy can never shut down, since it will wait
forever for the phantom task to finish.
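
the failure mode and one way to avoid it can be sketched as follows.
this is a minimal illustration and not the code from the patches:
`Registry`, `create_task` and the `update_index` closure are
hypothetical names standing in for the worker task list, task creation
and the task index update. the point is only that a failed setup step
has to undo the registration before the error is returned.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the global worker task list.
pub struct Registry {
    tasks: Mutex<HashMap<u64, String>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry {
            tasks: Mutex::new(HashMap::new()),
        }
    }

    /// Number of currently registered tasks.
    pub fn count(&self) -> usize {
        self.tasks.lock().unwrap().len()
    }

    /// Registers a task, then runs a fallible setup step (here a closure
    /// standing in for the task index update). If that step fails, the
    /// registration is rolled back so no phantom task stays behind.
    pub fn create_task<F>(&self, id: u64, name: &str, update_index: F) -> Result<u64, String>
    where
        F: FnOnce() -> Result<(), String>,
    {
        self.tasks.lock().unwrap().insert(id, name.to_string());

        if let Err(err) = update_index() {
            // Without this removal the task would remain registered
            // forever, since the caller never receives it.
            self.tasks.lock().unwrap().remove(&id);
            return Err(err);
        }

        Ok(id)
    }
}
```

with the rollback in place, a simulated lock timeout in `update_index`
leaves `count()` unchanged, so a shutdown waiting for the count to
reach zero is not blocked by the failed task.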

this issue was actually found in the wild on a system with lots of
activity.

issue 2 (patch 2):

a lock scope issue could cause a temporary inconsistency between the
task list and the task count if multiple tasks log their result in
parallel. the discrepancy disappears with the next task that is created
or logs its result, since the count is always reset to the current count
and not incremented/decremented.
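
a minimal sketch of the idea behind closing such a race window,
assuming the count is derived from the length of the task list:
`TaskList`, `add` and `finish` are hypothetical names, and the actual
data structures in worker_task.rs may differ. the count is published
while the lock on the list is still held, so two tasks finishing in
parallel cannot store their values in the wrong order.

```rust
use std::collections::HashSet;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical task list with a separately published task count.
pub struct TaskList {
    list: Mutex<HashSet<u64>>,
    count: AtomicUsize,
}

impl TaskList {
    pub fn new() -> Self {
        TaskList {
            list: Mutex::new(HashSet::new()),
            count: AtomicUsize::new(0),
        }
    }

    /// Adds a task and publishes the new count under the same lock.
    pub fn add(&self, id: u64) {
        let mut list = self.list.lock().unwrap();
        list.insert(id);
        // The guard is still alive here, so no other thread can change
        // the list between computing len() and storing the count.
        self.count.store(list.len(), Ordering::SeqCst);
    }

    /// Removes a finished task and publishes the new count under the
    /// same lock.
    pub fn finish(&self, id: u64) {
        let mut list = self.list.lock().unwrap();
        list.remove(&id);
        self.count.store(list.len(), Ordering::SeqCst);
    }

    /// Returns the last published count.
    pub fn count(&self) -> usize {
        self.count.load(Ordering::SeqCst)
    }
}
```

if the lock guard were dropped before the store, a thread that computed
an older length could store it after a thread that computed a newer
one, which matches the kind of temporary discrepancy described above.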

this issue was found while analyzing the code.

Fabian Grünbichler (4):
  rest-server: handle failure in worker task setup correctly
  rest-server: close race window when updating worker task count
  rest-server: make worker task creation error handling more idiomatic
  rest-server: increase task index lock timeout to 15s

 proxmox-rest-server/src/worker_task.rs | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

-- 
2.39.5




