[pbs-devel] [RFC v2 proxmox-backup 10/21] datastore: mark namespace as trash instead of deleting it

Christian Ebner c.ebner at proxmox.com
Mon May 12 09:47:59 CEST 2025


On 5/9/25 14:27, Fabian Grünbichler wrote:
> On May 8, 2025 3:05 pm, Christian Ebner wrote:
>> As for backup snapshots and groups, mark the namespace as trash
>> instead of removing it and the contents right away, if the trash
>> should not be bypassed.
>>
>> Actual removal of the hirarchical folder structure has to be taken
>> care of by the garbage collection.
>>
>> In order to avoid races during removal, first mark the namespaces as
>> trash pending, mark the snapshots and groups as trash and only after
>> rename the pending marker file to the trash marker file. By this,
>> concurrent backups can remove the trash pending marker to avoid the
>> namespace being trashed.
>>
>> On re-creation of a trashed namespace remove the marker file on itself
>> and any parent component from deepest to shallowest. As trashing a full
>> namespace can also set the trash pending state for recursive namespace
>> cleanup, remove encounters of that marker file as well to avoid the
>> namespace or its parent being trashed.
> 
> this is fairly involved since we don't have locks on namespaces..
> 
> should we have them for creation/removal/trashing/untrashing?

That is what I did try to avoid at all cost with the two-marker 
approach, as locking the namespace might be rather invasive. But if that 
does not work out as intended, I see no other way as to add exclusive 
locking for namespaces as well, yes.

> 
> I assume those are fairly rare occurrences, I haven't yet analyzed the
> interactions here to see whether the two-marker approach is actually
> race-free..
> 
> OTOH, do we really need to (be able to) trash namespaces?

Yes, I think we do need that as well since the datastore's hierarchy 
should remain in place, and the namespace iterator requires a way to 
distinguish between a namespace which has been trashed/deleted and a 
namespace which has not, but might contain trashed items. Otherwise a 
user requesting to forget a namespace, still sees the (empty as only 
trashed contents) namespace tree after the operation. Which would be 
rather unexpected?

> 
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>>   pbs-datastore/src/datastore.rs | 135 +++++++++++++++++++++++++++++----
>>   src/api2/admin/namespace.rs    |   2 +-
>>   src/server/pull.rs             |   2 +-
>>   3 files changed, 123 insertions(+), 16 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 023a6a12e..b1d81e199 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -28,7 +28,9 @@ use pbs_api_types::{
>>   };
>>   use pbs_config::BackupLockGuard;
>>   
>> -use crate::backup_info::{BackupDir, BackupGroup, BackupInfo, ListBackupFilter, OLD_LOCKING};
>> +use crate::backup_info::{
>> +    BackupDir, BackupGroup, BackupInfo, ListBackupFilter, OLD_LOCKING, TRASH_MARKER_FILENAME,
>> +};
>>   use crate::chunk_store::ChunkStore;
>>   use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
>>   use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
>> @@ -42,6 +44,8 @@ use crate::DataBlob;
>>   static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
>>       LazyLock::new(|| Mutex::new(HashMap::new()));
>>   
>> +const TRASH_PENDING_MARKER_FILENAME: &str = ".pending";
>> +
>>   /// checks if auth_id is owner, or, if owner is a token, if
>>   /// auth_id is the user of the token
>>   pub fn check_backup_owner(owner: &Authid, auth_id: &Authid) -> Result<(), Error> {
>> @@ -554,6 +558,7 @@ impl DataStore {
>>           ns_full_path.push(ns.path());
>>   
>>           std::fs::create_dir_all(ns_full_path)?;
>> +        self.untrash_namespace(&ns)?;
>>   
>>           Ok(ns)
>>       }
>> @@ -565,6 +570,34 @@ impl DataStore {
>>           path.exists()
>>       }
>>   
>> +    /// Remove the namespace and all it's parent components from the trash by removing the trash or
>> +    /// trash-pending marker file for each namespace level from deepest to shallowest. Missing files
>> +    /// are ignored.
>> +    pub fn untrash_namespace(&self, namespace: &BackupNamespace) -> Result<(), Error> {
>> +        let mut namespace = namespace.clone();
>> +        while namespace.depth() > 0 {
>> +            let mut trash_file_path = self.base_path();
>> +            trash_file_path.push(namespace.path());
>> +            let mut pending_file_path = trash_file_path.clone();
>> +            pending_file_path.push(TRASH_PENDING_MARKER_FILENAME);
>> +            if let Err(err) = std::fs::remove_file(&pending_file_path) {
>> +                // ignore not found, either not trashed or un-trashed by concurrent operation
>> +                if err.kind() != std::io::ErrorKind::NotFound {
>> +                    bail!("failed to remove trash-pending file {trash_file_path:?}: {err}");
>> +                }
>> +            }
>> +            trash_file_path.push(TRASH_MARKER_FILENAME);
>> +            if let Err(err) = std::fs::remove_file(&trash_file_path) {
>> +                // ignore not found, either not trashed or un-trashed by concurrent operation
>> +                if err.kind() != std::io::ErrorKind::NotFound {
>> +                    bail!("failed to remove trash file {trash_file_path:?}: {err}");
>> +                }
>> +            }
>> +            namespace.pop();
>> +        }
>> +        Ok(())
>> +    }
>> +
>>       /// Remove all backup groups of a single namespace level but not the namespace itself.
>>       ///
>>       /// Does *not* descends into child-namespaces and doesn't remoes the namespace itself either.
>> @@ -574,6 +607,7 @@ impl DataStore {
>>       pub fn remove_namespace_groups(
>>           self: &Arc<Self>,
>>           ns: &BackupNamespace,
>> +        skip_trash: bool,
>>       ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>>           // FIXME: locking? The single groups/snapshots are already protected, so may not be
>>           // necessary (depends on what we all allow to do with namespaces)
>> @@ -583,20 +617,22 @@ impl DataStore {
>>           let mut stats = BackupGroupDeleteStats::default();
>>   
>>           for group in self.iter_backup_groups(ns.to_owned())? {
>> -            let delete_stats = group?.destroy(true)?;
>> +            let delete_stats = group?.destroy(skip_trash)?;
>>               stats.add(&delete_stats);
>>               removed_all_groups = removed_all_groups && delete_stats.all_removed();
>>           }
>>   
>> -        let base_file = std::fs::File::open(self.base_path())?;
>> -        let base_fd = base_file.as_raw_fd();
>> -        for ty in BackupType::iter() {
>> -            let mut ty_dir = ns.path();
>> -            ty_dir.push(ty.to_string());
>> -            // best effort only, but we probably should log the error
>> -            if let Err(err) = unlinkat(Some(base_fd), &ty_dir, UnlinkatFlags::RemoveDir) {
>> -                if err != nix::errno::Errno::ENOENT {
>> -                    log::error!("failed to remove backup type {ty} in {ns} - {err}");
>> +        if skip_trash {
>> +            let base_file = std::fs::File::open(self.base_path())?;
>> +            let base_fd = base_file.as_raw_fd();
>> +            for ty in BackupType::iter() {
>> +                let mut ty_dir = ns.path();
>> +                ty_dir.push(ty.to_string());
>> +                // best effort only, but we probably should log the error
>> +                if let Err(err) = unlinkat(Some(base_fd), &ty_dir, UnlinkatFlags::RemoveDir) {
>> +                    if err != nix::errno::Errno::ENOENT {
>> +                        log::error!("failed to remove backup type {ty} in {ns} - {err}");
>> +                    }
>>                   }
>>               }
>>           }
>> @@ -613,6 +649,7 @@ impl DataStore {
>>           self: &Arc<Self>,
>>           ns: &BackupNamespace,
>>           delete_groups: bool,
>> +        skip_trash: bool,
>>       ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>>           let store = self.name();
>>           let mut removed_all_requested = true;
>> @@ -620,16 +657,68 @@ impl DataStore {
>>           if delete_groups {
>>               log::info!("removing whole namespace recursively below {store}:/{ns}",);
>>               for ns in self.recursive_iter_backup_ns(ns.to_owned(), NamespaceListFilter::Active)? {
>> -                let (removed_ns_groups, delete_stats) = self.remove_namespace_groups(&ns?)?;
>> +                let namespace = ns?;
>> +
>> +                if !skip_trash {
>> +                    let mut path = self.base_path();
>> +                    path.push(namespace.path());
>> +                    path.push(TRASH_PENDING_MARKER_FILENAME);
>> +                    if let Err(err) = std::fs::File::create(&path) {
> 
> pending marker created here iff delete_groups && !skip_trash
> 
>> +                        if err.kind() != std::io::ErrorKind::AlreadyExists {
>> +                            return Err(err).context("failed to set trash pending marker file");
>> +                        }
>> +                    }
>> +                }
>> +
>> +                let (removed_ns_groups, delete_stats) =
>> +                    self.remove_namespace_groups(&namespace, skip_trash)?;
>>                   stats.add(&delete_stats);
>>                   removed_all_requested = removed_all_requested && removed_ns_groups;
>> +
>> +                if !skip_trash && !removed_ns_groups {
>> +                    self.untrash_namespace(&namespace)?;
>> +                }
>>               }
>>           } else {
>>               log::info!("pruning empty namespace recursively below {store}:/{ns}");
>>           }
>>   
>> -        removed_all_requested =
>> -            removed_all_requested && self.destroy_namespace_recursive(ns, delete_groups)?;
>> +        if skip_trash {
>> +            removed_all_requested =
>> +                removed_all_requested && self.destroy_namespace_recursive(ns, delete_groups)?;
>> +            return Ok((removed_all_requested, stats));
> 
> early return here iff skip_trash
> 
>> +        }
>> +
>> +        let mut children = self
>> +            .recursive_iter_backup_ns(ns.to_owned(), NamespaceListFilter::Active)?
>> +            .collect::<Result<Vec<BackupNamespace>, Error>>()?;
>> +
>> +        children.sort_by_key(|b| std::cmp::Reverse(b.depth()));
>> +
>> +        let mut all_trashed = true;
>> +        for ns in children.iter() {
>> +            let mut ns_dir = ns.path();
>> +            ns_dir.push("ns");
>> +
>> +            if !ns.is_root() {
>> +                let mut path = self.base_path();
>> +                path.push(ns.path());
>> +
>> +                let pending_path = path.join(TRASH_PENDING_MARKER_FILENAME);
>> +                path.push(TRASH_MARKER_FILENAME);
>> +                if let Err(err) = std::fs::rename(pending_path, path) {
> 
> pending marker assumed to exist here iff !skip_trash
> 
>> +                    if err.kind() == std::io::ErrorKind::NotFound {
>> +                        all_trashed = false;
>> +                    } else {
>> +                        return Err(err).context("Renaming pending marker to trash marker failed");
>> +                    }
>> +                }
>> +            }
>> +        }
>> +
>> +        if !all_trashed {
>> +            bail!("failed to prune namespace, not empty");
> 
> wrong error returned as a result..

Ah good catch! Indeed the logic does not work when `delete_groups` and 
`skip_trash` is false, and the namespace to delete being empty. Will fix 
this in an upcoming iteration of the patches.

> 
>> +        }
>>   
>>           Ok((removed_all_requested, stats))
>>       }
>> @@ -657,6 +746,24 @@ impl DataStore {
>>               let _ = unlinkat(Some(base_fd), &ns_dir, UnlinkatFlags::RemoveDir);
>>   
>>               if !ns.is_root() {
>> +                let rel_trash_path = ns.path().join(TRASH_MARKER_FILENAME);
>> +                if let Err(err) =
>> +                    unlinkat(Some(base_fd), &rel_trash_path, UnlinkatFlags::NoRemoveDir)
>> +                {
>> +                    if err != nix::errno::Errno::ENOENT {
>> +                        bail!("removing the trash file '{rel_trash_path:?}' failed - {err}")
>> +                    }
>> +                }
>> +                let rel_pending_path = ns.path().join(TRASH_PENDING_MARKER_FILENAME);
>> +                if let Err(err) =
>> +                    unlinkat(Some(base_fd), &rel_pending_path, UnlinkatFlags::NoRemoveDir)
>> +                {
>> +                    if err != nix::errno::Errno::ENOENT {
>> +                        bail!(
>> +                            "removing the trash pending file '{rel_pending_path:?}' failed - {err}"
>> +                        )
>> +                    }
>> +                }
>>                   match unlinkat(Some(base_fd), &ns.path(), UnlinkatFlags::RemoveDir) {
>>                       Ok(()) => log::debug!("removed namespace {ns}"),
>>                       Err(nix::errno::Errno::ENOENT) => {
>> diff --git a/src/api2/admin/namespace.rs b/src/api2/admin/namespace.rs
>> index e5463524a..a12d97883 100644
>> --- a/src/api2/admin/namespace.rs
>> +++ b/src/api2/admin/namespace.rs
>> @@ -166,7 +166,7 @@ pub fn delete_namespace(
>>   
>>       let datastore = DataStore::lookup_datastore(&store, Some(Operation::Write))?;
>>   
>> -    let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups)?;
>> +    let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups, true)?;
>>       if !removed_all {
>>           let err_msg = if delete_groups {
>>               if datastore.old_locking() {
>> diff --git a/src/server/pull.rs b/src/server/pull.rs
>> index d3c6fcf6a..a60ccbf10 100644
>> --- a/src/server/pull.rs
>> +++ b/src/server/pull.rs
>> @@ -729,7 +729,7 @@ fn check_and_remove_ns(params: &PullParameters, local_ns: &BackupNamespace) -> R
>>       let (removed_all, _delete_stats) = params
>>           .target
>>           .store
>> -        .remove_namespace_recursive(local_ns, true)?;
>> +        .remove_namespace_recursive(local_ns, true, true)?;
>>   
>>       Ok(removed_all)
>>   }
>> -- 
>> 2.39.5
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 
> 





More information about the pbs-devel mailing list