[pbs-devel] [PATCH proxmox v2 1/4] rest-server: handle failure in worker task setup correctly

Fabian Grünbichler f.gruenbichler at proxmox.com
Mon Dec 2 14:04:09 CET 2024


if setting up a new worker fails after it has been inserted into the
WORKER_TASK_LIST, we need to clean it up instead of bubbling up the error right
away, else we "leak" the worker task and it never finishes.

a worker task that never finishes will indefinitely block shutdown
of the rest server process, including the "old" process when reloading
the rest server.

this issue was found in the wild on a system where contention on the
file-based lock covering task index updates led to lock acquisition
timeouts.

Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
---
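
Notes:
    a minimal, standalone sketch of the failure mode described above, for
    illustration only. `Registry`, `spawn_leaky`, `spawn_checked` and
    `failing_update` are hypothetical names and not part of
    proxmox-rest-server. the sketch removes the entry directly, whereas the
    patch below relies on worker.log_result() to undo the insertion.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a global list of running tasks.
struct Registry {
    tasks: Mutex<HashMap<u32, String>>,
}

impl Registry {
    fn new() -> Self {
        Registry {
            tasks: Mutex::new(HashMap::new()),
        }
    }

    /// Number of tasks that are still registered as running.
    fn active(&self) -> usize {
        self.tasks.lock().unwrap().len()
    }

    /// Leaky variant: `?` returns early and the entry stays registered forever.
    fn spawn_leaky(
        &self,
        id: u32,
        update_index: fn() -> Result<(), String>,
    ) -> Result<u32, String> {
        self.tasks.lock().unwrap().insert(id, "running".to_string());
        update_index()?;
        Ok(id)
    }

    /// Checked variant: undo the registration before propagating the error.
    fn spawn_checked(
        &self,
        id: u32,
        update_index: fn() -> Result<(), String>,
    ) -> Result<u32, String> {
        self.tasks.lock().unwrap().insert(id, "running".to_string());
        if let Err(err) = update_index() {
            self.tasks.lock().unwrap().remove(&id);
            return Err(err);
        }
        Ok(id)
    }
}

/// Simulates a setup step that fails, e.g. because a lock could not be acquired.
fn failing_update() -> Result<(), String> {
    Err("lock acquisition timed out".to_string())
}

fn main() {
    let leaky = Registry::new();
    assert!(leaky.spawn_leaky(1, failing_update).is_err());
    println!("leaky: {} task(s) still registered", leaky.active()); // 1

    let checked = Registry::new();
    assert!(checked.spawn_checked(1, failing_update).is_err());
    println!("checked: {} task(s) still registered", checked.active()); // 0
}
```

    in the leaky variant, anything that waits for the registry to become
    empty (such as a shutdown path) would wait indefinitely.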
 proxmox-rest-server/src/worker_task.rs | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/proxmox-rest-server/src/worker_task.rs b/proxmox-rest-server/src/worker_task.rs
index 6e76c2ca..3ca93965 100644
--- a/proxmox-rest-server/src/worker_task.rs
+++ b/proxmox-rest-server/src/worker_task.rs
@@ -923,7 +923,12 @@ impl WorkerTask {
             set_worker_count(hash.len());
         }
 
-        setup.update_active_workers(Some(&upid))?;
+        let res = setup.update_active_workers(Some(&upid));
+        if res.is_err() {
+            // needed to undo the insertion into WORKER_TASK_LIST above
+            worker.log_result(&res);
+            res?
+        }
 
         Ok((worker, logger))
     }
-- 
2.39.5