[pbs-devel] [PATCH proxmox-backup 2/3] backup env: keep a shared chunk store lock for duration of backup
Fabian Grünbichler
f.gruenbichler at proxmox.com
Wed Oct 1 13:19:10 CEST 2025
instead of relying on individual index writers obtaining it.
this closes a race window by extending the existence of "backup writers" (as
defined for GC cutoff calculation purposes) from start of the backup task until
the backup is finished and no more index writers exist by definition.
the race windows looked like this:
backup session is started
backup-proxy is reloaded
...
index writer (in old process!) is done and releases shared lock on the chunk store
new proxy process starts a GC which takes exclusive lock
backup in old process fails to obtain shared lock for next index and is aborted
or this:
backup session is started
backup-proxy is reloaded
...
backup session is still busy downloading previous snapshot
new proxy process starts a GC which takes exclusive lock
backup in old process fails to obtain shared lock for first index and is aborted
since the desired semantics for GC and backup sessions is for backup sessions
to have higher priority, this is not desired behaviour.
GC itself is still safe, as it only looks at indices and not the manifest, so
the third race window (no shared lock being held by the backup env inbetween
last index writer being closed and the session itself being finished) has no
practical effect.
Signed-off-by: Fabian Grünbichler <f.gruenbichler at proxmox.com>
---
Notes:
next patch will drop the locks being held by the index writer, since their only
use case was protecting the backup session, which is now covered by the backup
env itself..
src/api2/backup/environment.rs | 5 +++++
src/api2/backup/mod.rs | 4 +++-
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/src/api2/backup/environment.rs b/src/api2/backup/environment.rs
index fa1444ab1..8dc274a29 100644
--- a/src/api2/backup/environment.rs
+++ b/src/api2/backup/environment.rs
@@ -1,5 +1,6 @@
use anyhow::{bail, format_err, Context, Error};
use pbs_config::BackupLockGuard;
+use proxmox_sys::process_locker::ProcessLockSharedGuard;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
@@ -103,6 +104,7 @@ pub struct BackupLockGuards {
previous_snapshot: Option<BackupLockGuard>,
group: Option<BackupLockGuard>,
snapshot: Option<BackupLockGuard>,
+ chunk_store: Option<ProcessLockSharedGuard>,
}
impl BackupLockGuards {
@@ -110,11 +112,13 @@ impl BackupLockGuards {
previous_snapshot: Option<BackupLockGuard>,
group: BackupLockGuard,
snapshot: BackupLockGuard,
+ chunk_store: ProcessLockSharedGuard,
) -> Self {
Self {
previous_snapshot,
group: Some(group),
snapshot: Some(snapshot),
+ chunk_store: Some(chunk_store),
}
}
}
@@ -744,6 +748,7 @@ impl BackupEnvironment {
}
// drop previous snapshot lock
state.backup_lock_guards.previous_snapshot.take();
+ state.backup_lock_guards.chunk_store.take();
let stats = serde_json::to_value(state.backup_stat)?;
diff --git a/src/api2/backup/mod.rs b/src/api2/backup/mod.rs
index 246b0946d..e15d813a2 100644
--- a/src/api2/backup/mod.rs
+++ b/src/api2/backup/mod.rs
@@ -214,7 +214,9 @@ fn upgrade_to_backup_protocol(
// Drop them on successful backup finish or when dropping the env after cleanup in
// case of errors. The former is required for immediate subsequent backups (e.g.
// during a push sync) to be able to lock the group and snapshots.
- let backup_lock_guards = BackupLockGuards::new(last_guard, group_guard, snap_guard);
+ let chunk_store_guard = datastore.try_shared_chunk_store_lock()?;
+ let backup_lock_guards =
+ BackupLockGuards::new(last_guard, group_guard, snap_guard, chunk_store_guard);
let mut env = BackupEnvironment::new(
env_type,
--
2.47.3
More information about the pbs-devel
mailing list