[pbs-devel] [PATCH proxmox-backup] fix #3336: api: remove backup group if the last snapshot is removed

Thomas Lamprecht t.lamprecht at proxmox.com
Fri Mar 11 13:20:22 CET 2022


On 09.03.22 14:50, Stefan Sterz wrote:
> Signed-off-by: Stefan Sterz <s.sterz at proxmox.com>
> ---
>  pbs-datastore/src/datastore.rs | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index d416c8d8..623b7688 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -346,6 +346,28 @@ impl DataStore {
>                  )
>              })?;
>  
> +        // check if this was the last snapshot and if so remove the group
> +        if backup_dir
> +            .group()
> +            .list_backups(&self.base_path())?
> +            .is_empty()
> +        {

a log::info could be appropriate in the "success" (i.e., delete dir) case.

I'd factor the block below out into a non-pub (or pub(crate)) remove_empty_group_dir fn.

> +            let group_path = self.group_path(backup_dir.group());
> +            let _guard = proxmox_sys::fs::lock_dir_noblock(
> +                &group_path,
> +                "backup group",
> +                "possible running backup",
> +            )?;
> +
> +            std::fs::remove_dir_all(&group_path).map_err(|err| {

this is still unsafe as there's a TOCTOU race; the lock does not protect you from the
following sequence with two threads/async-executions t1 and t2:

t1.1 snapshot deleted
t1.2 empty dir check holds up, entering "delete group dir" code branch
t2.1                                        create new snapshot in group -> lock group dir
t2.2                                        finish new snapshot in group -> unlock group dir
t1.3 lock group dir
t1.4 delete all files, including the new snapshot made in-between.
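
To make the sequence above concrete, here is a minimal single-threaded replay of it using only std. It is a sketch, not the actual datastore code: `is_empty_dir`, `replay_race` and the file names are made up for illustration, and the directory lock is left out because, as described above, it is only taken after the check and so does not change the outcome.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: true if `dir` contains no entries.
fn is_empty_dir(dir: &Path) -> io::Result<bool> {
    Ok(fs::read_dir(dir)?.next().is_none())
}

/// Replays t1.2 -> t2.1/t2.2 -> t1.3/t1.4 in one thread.
/// Returns true if the snapshot created in-between got deleted.
fn replay_race(group: &Path) -> io::Result<bool> {
    fs::create_dir_all(group)?;

    // t1.2: the emptiness check holds, t1 decides to delete the group
    let was_empty = is_empty_dir(group)?;

    // t2.1 + t2.2: another task creates and finishes a new snapshot
    let snapshot = group.join("new-snapshot");
    fs::create_dir(&snapshot)?;
    fs::write(snapshot.join("index.json.blob"), b"data")?;

    // t1.3 + t1.4: t1 acts on the stale check result
    if was_empty {
        fs::remove_dir_all(group)?;
    }

    Ok(!snapshot.exists())
}
```

Calling `replay_race` on a fresh path returns `true`, i.e. the new snapshot is gone even though every single step succeeded.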

Rather, just use the safer "remove_dir" variant; that way the TOCTOU race doesn't matter and
the check merely becomes a shortcut. If we'd explicitly check for
  `err.kind() == ErrorKind::DirectoryNotEmpty`
and silence that error, we could even do away with the check. That should result in the same
number of syscalls in the best case (one rmdir vs. one readdir) and can be better on success
(rmdir only vs. readdir + rmdir), not that performance matters much in this case.
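
A rough sketch of the remove_dir approach, using only std. The signature and the bool return value are assumptions for illustration, not existing proxmox-backup API, and locking and logging are left out. It also assumes a toolchain where `ErrorKind::DirectoryNotEmpty` is stable (Rust 1.83 or later, as far as I know).

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Hypothetical helper: try to remove the group directory without checking
/// for emptiness first.
///
/// Returns Ok(true) if the directory was removed and Ok(false) if it still
/// contains entries (for example a snapshot created in the meantime).
/// Any other error is passed on to the caller.
fn remove_empty_group_dir(group_path: &Path) -> io::Result<bool> {
    match fs::remove_dir(group_path) {
        Ok(()) => Ok(true),
        // rmdir refuses to delete a non-empty directory, so a racing
        // snapshot creation just turns this into a no-op
        Err(err) if err.kind() == ErrorKind::DirectoryNotEmpty => Ok(false),
        Err(err) => Err(err),
    }
}
```

The emptiness decision is made by the rmdir call itself, so there is no window between check and delete in which a new snapshot could be lost.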

fyi, "remove_backup_group", the place where I think you copied this part from, can use
remove_dir_all safely because there's no check to be made there, so no TOCTOU.

> +                format_err!(
> +                    "removing backup group directory {:?} failed - {}",
> +                    group_path,
> +                    err,
> +                )
> +            })?;
> +        }
> +
>          // the manifest does not exists anymore, we do not need to keep the lock
>          if let Ok(path) = self.manifest_lock_path(backup_dir) {
>              // ignore errors
