[pbs-devel] [RFC v2 proxmox-backup 10/21] datastore: mark namespace as trash instead of deleting it
Christian Ebner
c.ebner at proxmox.com
Mon May 12 09:47:59 CEST 2025
On 5/9/25 14:27, Fabian Grünbichler wrote:
> On May 8, 2025 3:05 pm, Christian Ebner wrote:
>> As for backup snapshots and groups, mark the namespace as trash
>> instead of removing it and the contents right away, if the trash
>> should not be bypassed.
>>
>> Actual removal of the hirarchical folder structure has to be taken
>> care of by the garbage collection.
>>
>> In order to avoid races during removal, first mark the namespaces as
>> trash pending, mark the snapshots and groups as trash and only after
>> rename the pending marker file to the trash marker file. By this,
>> concurrent backups can remove the trash pending marker to avoid the
>> namespace being trashed.
>>
>> On re-creation of a trashed namespace remove the marker file on itself
>> and any parent component from deepest to shallowest. As trashing a full
>> namespace can also set the trash pending state for recursive namespace
>> cleanup, remove encounters of that marker file as well to avoid the
>> namespace or its parent being trashed.
>
> this is fairly involved since we don't have locks on namespaces..
>
> should we have them for creation/removal/trashing/untrashing?
That is what I did try to avoid at all cost with the two-marker
approach, as locking the namespace might be rather invasive. But if that
does not work out as intended, I see no other way as to add exclusive
locking for namespaces as well, yes.
>
> I assume those are fairly rare occurrences, I haven't yet analyzed the
> interactions here to see whether the two-marker approach is actually
> race-free..
>
> OTOH, do we really need to (be able to) trash namespaces?
Yes, I think we do need that as well since the datastore's hierarchy
should remain in place, and the namespace iterator requires a way to
distinguish between a namespace which has been trashed/deleted and a
namespace which has not, but might contain trashed items. Otherwise a
user requesting to forget a namespace, still sees the (empty as only
trashed contents) namespace tree after the operation. Which would be
rather unexpected?
>
>>
>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>> ---
>> pbs-datastore/src/datastore.rs | 135 +++++++++++++++++++++++++++++----
>> src/api2/admin/namespace.rs | 2 +-
>> src/server/pull.rs | 2 +-
>> 3 files changed, 123 insertions(+), 16 deletions(-)
>>
>> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
>> index 023a6a12e..b1d81e199 100644
>> --- a/pbs-datastore/src/datastore.rs
>> +++ b/pbs-datastore/src/datastore.rs
>> @@ -28,7 +28,9 @@ use pbs_api_types::{
>> };
>> use pbs_config::BackupLockGuard;
>>
>> -use crate::backup_info::{BackupDir, BackupGroup, BackupInfo, ListBackupFilter, OLD_LOCKING};
>> +use crate::backup_info::{
>> + BackupDir, BackupGroup, BackupInfo, ListBackupFilter, OLD_LOCKING, TRASH_MARKER_FILENAME,
>> +};
>> use crate::chunk_store::ChunkStore;
>> use crate::dynamic_index::{DynamicIndexReader, DynamicIndexWriter};
>> use crate::fixed_index::{FixedIndexReader, FixedIndexWriter};
>> @@ -42,6 +44,8 @@ use crate::DataBlob;
>> static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
>> LazyLock::new(|| Mutex::new(HashMap::new()));
>>
>> +const TRASH_PENDING_MARKER_FILENAME: &str = ".pending";
>> +
>> /// checks if auth_id is owner, or, if owner is a token, if
>> /// auth_id is the user of the token
>> pub fn check_backup_owner(owner: &Authid, auth_id: &Authid) -> Result<(), Error> {
>> @@ -554,6 +558,7 @@ impl DataStore {
>> ns_full_path.push(ns.path());
>>
>> std::fs::create_dir_all(ns_full_path)?;
>> + self.untrash_namespace(&ns)?;
>>
>> Ok(ns)
>> }
>> @@ -565,6 +570,34 @@ impl DataStore {
>> path.exists()
>> }
>>
>> + /// Remove the namespace and all it's parent components from the trash by removing the trash or
>> + /// trash-pending marker file for each namespace level from deepest to shallowest. Missing files
>> + /// are ignored.
>> + pub fn untrash_namespace(&self, namespace: &BackupNamespace) -> Result<(), Error> {
>> + let mut namespace = namespace.clone();
>> + while namespace.depth() > 0 {
>> + let mut trash_file_path = self.base_path();
>> + trash_file_path.push(namespace.path());
>> + let mut pending_file_path = trash_file_path.clone();
>> + pending_file_path.push(TRASH_PENDING_MARKER_FILENAME);
>> + if let Err(err) = std::fs::remove_file(&pending_file_path) {
>> + // ignore not found, either not trashed or un-trashed by concurrent operation
>> + if err.kind() != std::io::ErrorKind::NotFound {
>> + bail!("failed to remove trash-pending file {trash_file_path:?}: {err}");
>> + }
>> + }
>> + trash_file_path.push(TRASH_MARKER_FILENAME);
>> + if let Err(err) = std::fs::remove_file(&trash_file_path) {
>> + // ignore not found, either not trashed or un-trashed by concurrent operation
>> + if err.kind() != std::io::ErrorKind::NotFound {
>> + bail!("failed to remove trash file {trash_file_path:?}: {err}");
>> + }
>> + }
>> + namespace.pop();
>> + }
>> + Ok(())
>> + }
>> +
>> /// Remove all backup groups of a single namespace level but not the namespace itself.
>> ///
>> /// Does *not* descends into child-namespaces and doesn't remoes the namespace itself either.
>> @@ -574,6 +607,7 @@ impl DataStore {
>> pub fn remove_namespace_groups(
>> self: &Arc<Self>,
>> ns: &BackupNamespace,
>> + skip_trash: bool,
>> ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>> // FIXME: locking? The single groups/snapshots are already protected, so may not be
>> // necessary (depends on what we all allow to do with namespaces)
>> @@ -583,20 +617,22 @@ impl DataStore {
>> let mut stats = BackupGroupDeleteStats::default();
>>
>> for group in self.iter_backup_groups(ns.to_owned())? {
>> - let delete_stats = group?.destroy(true)?;
>> + let delete_stats = group?.destroy(skip_trash)?;
>> stats.add(&delete_stats);
>> removed_all_groups = removed_all_groups && delete_stats.all_removed();
>> }
>>
>> - let base_file = std::fs::File::open(self.base_path())?;
>> - let base_fd = base_file.as_raw_fd();
>> - for ty in BackupType::iter() {
>> - let mut ty_dir = ns.path();
>> - ty_dir.push(ty.to_string());
>> - // best effort only, but we probably should log the error
>> - if let Err(err) = unlinkat(Some(base_fd), &ty_dir, UnlinkatFlags::RemoveDir) {
>> - if err != nix::errno::Errno::ENOENT {
>> - log::error!("failed to remove backup type {ty} in {ns} - {err}");
>> + if skip_trash {
>> + let base_file = std::fs::File::open(self.base_path())?;
>> + let base_fd = base_file.as_raw_fd();
>> + for ty in BackupType::iter() {
>> + let mut ty_dir = ns.path();
>> + ty_dir.push(ty.to_string());
>> + // best effort only, but we probably should log the error
>> + if let Err(err) = unlinkat(Some(base_fd), &ty_dir, UnlinkatFlags::RemoveDir) {
>> + if err != nix::errno::Errno::ENOENT {
>> + log::error!("failed to remove backup type {ty} in {ns} - {err}");
>> + }
>> }
>> }
>> }
>> @@ -613,6 +649,7 @@ impl DataStore {
>> self: &Arc<Self>,
>> ns: &BackupNamespace,
>> delete_groups: bool,
>> + skip_trash: bool,
>> ) -> Result<(bool, BackupGroupDeleteStats), Error> {
>> let store = self.name();
>> let mut removed_all_requested = true;
>> @@ -620,16 +657,68 @@ impl DataStore {
>> if delete_groups {
>> log::info!("removing whole namespace recursively below {store}:/{ns}",);
>> for ns in self.recursive_iter_backup_ns(ns.to_owned(), NamespaceListFilter::Active)? {
>> - let (removed_ns_groups, delete_stats) = self.remove_namespace_groups(&ns?)?;
>> + let namespace = ns?;
>> +
>> + if !skip_trash {
>> + let mut path = self.base_path();
>> + path.push(namespace.path());
>> + path.push(TRASH_PENDING_MARKER_FILENAME);
>> + if let Err(err) = std::fs::File::create(&path) {
>
> pending marker created here iff delete_groups && !skip_trash
>
>> + if err.kind() != std::io::ErrorKind::AlreadyExists {
>> + return Err(err).context("failed to set trash pending marker file");
>> + }
>> + }
>> + }
>> +
>> + let (removed_ns_groups, delete_stats) =
>> + self.remove_namespace_groups(&namespace, skip_trash)?;
>> stats.add(&delete_stats);
>> removed_all_requested = removed_all_requested && removed_ns_groups;
>> +
>> + if !skip_trash && !removed_ns_groups {
>> + self.untrash_namespace(&namespace)?;
>> + }
>> }
>> } else {
>> log::info!("pruning empty namespace recursively below {store}:/{ns}");
>> }
>>
>> - removed_all_requested =
>> - removed_all_requested && self.destroy_namespace_recursive(ns, delete_groups)?;
>> + if skip_trash {
>> + removed_all_requested =
>> + removed_all_requested && self.destroy_namespace_recursive(ns, delete_groups)?;
>> + return Ok((removed_all_requested, stats));
>
> early return here iff skip_trash
>
>> + }
>> +
>> + let mut children = self
>> + .recursive_iter_backup_ns(ns.to_owned(), NamespaceListFilter::Active)?
>> + .collect::<Result<Vec<BackupNamespace>, Error>>()?;
>> +
>> + children.sort_by_key(|b| std::cmp::Reverse(b.depth()));
>> +
>> + let mut all_trashed = true;
>> + for ns in children.iter() {
>> + let mut ns_dir = ns.path();
>> + ns_dir.push("ns");
>> +
>> + if !ns.is_root() {
>> + let mut path = self.base_path();
>> + path.push(ns.path());
>> +
>> + let pending_path = path.join(TRASH_PENDING_MARKER_FILENAME);
>> + path.push(TRASH_MARKER_FILENAME);
>> + if let Err(err) = std::fs::rename(pending_path, path) {
>
> pending marker assumed to exist here iff !skip_trash
>
>> + if err.kind() == std::io::ErrorKind::NotFound {
>> + all_trashed = false;
>> + } else {
>> + return Err(err).context("Renaming pending marker to trash marker failed");
>> + }
>> + }
>> + }
>> + }
>> +
>> + if !all_trashed {
>> + bail!("failed to prune namespace, not empty");
>
> wrong error returned as a result..
Ah good catch! Indeed the logic does not work when `delete_groups` and
`skip_trash` is false, and the namespace to delete being empty. Will fix
this in an upcoming iteration of the patches.
>
>> + }
>>
>> Ok((removed_all_requested, stats))
>> }
>> @@ -657,6 +746,24 @@ impl DataStore {
>> let _ = unlinkat(Some(base_fd), &ns_dir, UnlinkatFlags::RemoveDir);
>>
>> if !ns.is_root() {
>> + let rel_trash_path = ns.path().join(TRASH_MARKER_FILENAME);
>> + if let Err(err) =
>> + unlinkat(Some(base_fd), &rel_trash_path, UnlinkatFlags::NoRemoveDir)
>> + {
>> + if err != nix::errno::Errno::ENOENT {
>> + bail!("removing the trash file '{rel_trash_path:?}' failed - {err}")
>> + }
>> + }
>> + let rel_pending_path = ns.path().join(TRASH_PENDING_MARKER_FILENAME);
>> + if let Err(err) =
>> + unlinkat(Some(base_fd), &rel_pending_path, UnlinkatFlags::NoRemoveDir)
>> + {
>> + if err != nix::errno::Errno::ENOENT {
>> + bail!(
>> + "removing the trash pending file '{rel_pending_path:?}' failed - {err}"
>> + )
>> + }
>> + }
>> match unlinkat(Some(base_fd), &ns.path(), UnlinkatFlags::RemoveDir) {
>> Ok(()) => log::debug!("removed namespace {ns}"),
>> Err(nix::errno::Errno::ENOENT) => {
>> diff --git a/src/api2/admin/namespace.rs b/src/api2/admin/namespace.rs
>> index e5463524a..a12d97883 100644
>> --- a/src/api2/admin/namespace.rs
>> +++ b/src/api2/admin/namespace.rs
>> @@ -166,7 +166,7 @@ pub fn delete_namespace(
>>
>> let datastore = DataStore::lookup_datastore(&store, Some(Operation::Write))?;
>>
>> - let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups)?;
>> + let (removed_all, stats) = datastore.remove_namespace_recursive(&ns, delete_groups, true)?;
>> if !removed_all {
>> let err_msg = if delete_groups {
>> if datastore.old_locking() {
>> diff --git a/src/server/pull.rs b/src/server/pull.rs
>> index d3c6fcf6a..a60ccbf10 100644
>> --- a/src/server/pull.rs
>> +++ b/src/server/pull.rs
>> @@ -729,7 +729,7 @@ fn check_and_remove_ns(params: &PullParameters, local_ns: &BackupNamespace) -> R
>> let (removed_all, _delete_stats) = params
>> .target
>> .store
>> - .remove_namespace_recursive(local_ns, true)?;
>> + .remove_namespace_recursive(local_ns, true, true)?;
>>
>> Ok(removed_all)
>> }
>> --
>> 2.39.5
>>
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel at lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
>>
>>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
More information about the pbs-devel
mailing list