[pbs-devel] [PATCH proxmox-backup 2/2] api: backup: never hold mutex guard when doing manifest update

Fabian Grünbichler f.gruenbichler at proxmox.com
Thu Sep 25 14:46:07 CEST 2025


On September 24, 2025 4:56 pm, Christian Ebner wrote:
> A manifest update with s3 backend will call async code, which must
> be avoided because of possible deadlocks [0]. Therefore, perform all
> changes on the shared backup state and drop the guard before
> updating the manifest, which performs the backend specific update.
> Dropping the guard prematurely is fine, as the state has already been
> set to be finished, so no other api calls belonging to the same
> backup task can perform further changes anyways.
> 
> [0] https://docs.rs/tokio/latest/tokio/sync/struct.Mutex.html#which-kind-of-mutex-should-you-use
> 
> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
> ---
>  src/api2/backup/environment.rs | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
> index e535891a4..073027c51 100644
> --- a/src/api2/backup/environment.rs
> +++ b/src/api2/backup/environment.rs
> @@ -682,8 +682,15 @@ impl BackupEnvironment {
>              }
>          }
>  
> -        // check for valid manifest and store stats
>          let stats = serde_json::to_value(state.backup_stat)?;
> +
> +        // marks the backup state as finished, so no other api calls can modify its state anymore
> +        state.finished = true;

marking it as finished (which prevents cleanup in case the client
connection disappears!)

> +        // never hold mutex guard during s3 upload due to possible deadlocks
> +        drop(state);
> +
> +        // check for valid manifest and store stats
>          self.backup_dir
>              .update_manifest(&self.backend, |manifest| {
>                  manifest.unprotected["chunk_upload_stats"] = stats;
> @@ -692,9 +699,6 @@ impl BackupEnvironment {
>  
>          self.datastore.try_ensure_sync_level()?;

before this has been called seems kind of dangerous?

why not update the manifest up front, then lock the state etc.? or:

  lock
  do_some_checks
  mark_as_finishing (new state that needs to be checked in some places)
  drop state
  update_manifest
  lock
  do_checks_again
  mark_as_finished

that way it should be race-free but still safe..
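
to make the second variant a bit more concrete, here is a rough, self-contained
sketch. it is not the actual BackupEnvironment code: `Phase`, `SharedState`,
`Env`, `register_chunk` and the `update_manifest` closure are all made-up names
for illustration, and the closure stands in for the backend specific manifest
update (plus whatever else has to succeed before the backup counts as done). it
assumes a plain std::sync::Mutex around the shared state.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical three-valued state replacing a plain `finished: bool`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Running,
    Finishing,
    Finished,
}

/// Hypothetical stand-in for the shared backup state.
struct SharedState {
    phase: Phase,
    stats: u64,
}

/// Hypothetical stand-in for the backup environment.
#[derive(Clone)]
struct Env {
    state: Arc<Mutex<SharedState>>,
}

impl Env {
    fn new() -> Self {
        let state = SharedState { phase: Phase::Running, stats: 0 };
        Env { state: Arc::new(Mutex::new(state)) }
    }

    fn phase(&self) -> Phase {
        self.state.lock().unwrap().phase
    }

    /// Example of another API call of the same backup task: it must check
    /// the new intermediate state and refuse to modify anything.
    fn register_chunk(&self, size: u64) -> Result<(), String> {
        let mut state = self.state.lock().unwrap();
        if state.phase != Phase::Running {
            return Err("backup is finishing or already finished".into());
        }
        state.stats += size;
        Ok(())
    }

    /// lock, check, mark as finishing, drop, update, lock, check, mark as finished.
    fn finish<F>(&self, update_manifest: F) -> Result<(), String>
    where
        F: FnOnce(u64) -> Result<(), String>,
    {
        let stats = {
            let mut state = self.state.lock().unwrap();
            if state.phase != Phase::Running {
                return Err("backup is finishing or already finished".into());
            }
            state.phase = Phase::Finishing;
            state.stats
        }; // guard dropped here, before any backend work happens

        // No guard is held during the (possibly slow) update. On error the
        // phase stays `Finishing`, i.e. the backup is never marked as
        // finished and can still be treated as incomplete by cleanup.
        update_manifest(stats)?;

        let mut state = self.state.lock().unwrap();
        if state.phase != Phase::Finishing {
            return Err("unexpected state change while finishing".into());
        }
        state.phase = Phase::Finished;
        Ok(())
    }
}
```

the point being that `Finished` is only ever set after everything else went
through, while concurrent calls are already locked out via `Finishing`. the
cleanup path would then need to treat `Finishing` like "not finished".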

> -        // marks the backup as successful
> -        state.finished = true;
> -
>          Ok(())
>      }
>  
> -- 
> 2.47.3
> 
> 
> 



