[pbs-devel] [RFC PATCH 4/5] fix #3935: datastore: keep manifest locks, avoid similar locking issue

Stefan Sterz s.sterz at proxmox.com
Fri Mar 18 15:06:54 CET 2022

just as removing the directories we locked on is an issue, so is
removing the locks for the manifest.

Signed-off-by: Stefan Sterz <s.sterz at proxmox.com>
Similarly to the issue that led to this patch series the following
seems plausible to me:

A wants to remove a group, locks group, locks manifest
  (via `remove_group`)

    B wants to interact with the manifest, begins locking manifest via
      `lock_manifest`, it opens file handle to the lock file

A deletes group, deletes manifest      

        C wants to interact with the manifest, re-creates lock via
          `lock_manifest`, now holds a lock to the manifest

A drops group and manifest lock

    B acquires lock

B and C now hold the lock to the same manifest

I am not sure how likely this is or how "bad", but given that
`lock_manifest` has a 5 second timeout, it might be more likely than
it appears at first. Correct me if I am wrong though.

If I am not mistaken, this is also the reason why we can
basically _never_ remove a file/directory that acts as a lock. One
solution would be to require a higher lock protecting the lower lock,
which would need to be acquired by all threads interacting with the
given object. However, this basically negates the principle of
acquiring locks with the least privileges and would more or less
converge towards a global lock.

For example: To remove a snapshot lock, we need to lock the group. If
its the last snapshot and we want to remove the group, we'd need to
lock the data store. Now every thread that wants to access a
snapshot, needs to acquire a data store lock.

There might be a nicer solution that I am missing. If I am wrong
please tell me :)

 pbs-datastore/src/datastore.rs | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
index 11bd578d..60383860 100644
--- a/pbs-datastore/src/datastore.rs
+++ b/pbs-datastore/src/datastore.rs
@@ -346,12 +346,6 @@ impl DataStore {
-        // the manifest does not exists anymore, we do not need to keep the lock
-        if let Ok(path) = self.manifest_lock_path(backup_dir) {
-            // ignore errors
-            let _ = std::fs::remove_file(path);
-        }

More information about the pbs-devel mailing list