[pbs-devel] [PATCH proxmox-backup v5 3/4] partial fix #6049: datastore: use config fast-path in Drop
Fabian Grünbichler
f.gruenbichler at proxmox.com
Wed Nov 26 16:15:50 CET 2025
On November 24, 2025 6:04 pm, Samuel Rufinatscha wrote:
> The Drop impl of DataStore re-reads datastore.cfg to decide whether
> the entry should be evicted from the in-process cache (based on the
> maintenance mode's clear_from_cache). During the investigation of
> issue #6049 [1], a flamegraph [2] showed that the config reload in Drop
> accounted for a measurable share of CPU time under load.
>
> This patch wires the datastore config fast path into the Drop
> impl, avoiding an expensive config reload from disk when checking the
> maintenance mode. Additionally, to ensure that Drop handlers detect
> that a newer config exists and do not act on a potentially stale
> cached entry, the generation is bumped not only on config save but
> also on a re-read of the config file (slow path) when
> `update_cache = true`.
>
> Links
>
> [1] Bugzilla: https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> [2] cargo-flamegraph: https://github.com/flamegraph-rs/flamegraph
>
> Fixes: #6049
> Signed-off-by: Samuel Rufinatscha <s.rufinatscha at proxmox.com>
> ---
> Changes:
>
> From v1 → v2
> - Replace caching logic with the datastore_section_config_cached()
> helper.
>
> From v2 → v3
> No changes
>
> From v3 → v4, thanks @Fabian
> - Pass datastore_section_config_cached(false) in Drop to avoid
> concurrent cache updates.
>
> From v4 → v5
> - Rebased only, no changes
>
> pbs-datastore/src/datastore.rs | 60 ++++++++++++++++++++++++++--------
> 1 file changed, 47 insertions(+), 13 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index c9cb5d65..7638a899 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -225,15 +225,40 @@ impl Drop for DataStore {
> // remove datastore from cache iff
> // - last task finished, and
> // - datastore is in a maintenance mode that mandates it
> - let remove_from_cache = last_task
> - && pbs_config::datastore::config()
> - .and_then(|(s, _)| s.lookup::<DataStoreConfig>("datastore", self.name()))
> - .is_ok_and(|c| {
> - c.get_maintenance_mode()
> - .is_some_and(|m| m.clear_from_cache())
> - });
the old code here ignored parsing/locking/... issues and just assumed
that if no config can be obtained, nothing should be done..
> -
> - if remove_from_cache {
> +
> + // first check: check if last task finished
> + if !last_task {
> + return;
> + }
> +
> + let (section_config, _gen) = match datastore_section_config_cached(false) {
> + Ok(v) => v,
> + Err(err) => {
> + log::error!(
> + "failed to load datastore config in Drop for {} - {err}",
> + self.name()
> + );
> + return;
> + }
> + };
> +
> + let datastore_cfg: DataStoreConfig =
> + match section_config.lookup("datastore", self.name()) {
> + Ok(cfg) => cfg,
> + Err(err) => {
> + log::error!(
> + "failed to look up datastore '{}' in Drop - {err}",
> + self.name()
> + );
> + return;
here we now have fancy error logging ;) which can be fine, but if we go
from silently ignoring errors to logging them at error level, that should
be mentioned to make it clear that it is intentional.
besides that, the second error here means that the datastore was removed
from the config in the meantime.. in which case we should probably
remove it from the map as well, if it is still there, even though we
can't check the maintenance mode..
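something along these lines would cover that case (untested sketch with a
simplified map and a hypothetical lookup result type, not the actual
pbs-datastore types):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// stand-in for DATASTORE_MAP; the value type is irrelevant here
type DatastoreMap = Mutex<HashMap<String, ()>>;

/// `lookup` models the result of `section_config.lookup("datastore", name)`:
/// Ok(clear_from_cache) if the entry still exists, Err if it is gone.
fn drop_cleanup(map: &DatastoreMap, name: &str, lookup: Result<bool, String>) {
    match lookup {
        // entry still configured: evict only if maintenance mode says so
        Ok(clear_from_cache) => {
            if clear_from_cache {
                map.lock().unwrap().remove(name);
            }
        }
        // entry vanished from the config: evict the stale cache entry too,
        // even though the maintenance mode can no longer be checked
        Err(_removed) => {
            map.lock().unwrap().remove(name);
        }
    }
}

fn main() {
    let map: DatastoreMap = Mutex::new(HashMap::from([("store1".to_string(), ())]));
    // simulate the datastore having been removed from datastore.cfg
    drop_cleanup(&map, "store1", Err("no such datastore".to_string()));
    assert!(!map.lock().unwrap().contains_key("store1"));
}
```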
> + }
> + };
> +
> + // second check: check maintenance mode mandate
what is a "maintenance mode mandate"? ;)
keeping it simple, why not just
// check if maintenance mode requires closing FDs
> + if datastore_cfg
> + .get_maintenance_mode()
> + .is_some_and(|m| m.clear_from_cache())
> + {
> DATASTORE_MAP.lock().unwrap().remove(self.name());
> }
> }
> @@ -307,12 +332,12 @@ impl DatastoreThreadSettings {
> /// - If the cached generation matches the current generation, the
> /// cached config is returned.
> /// - Otherwise the config is re-read from disk. If `update_cache` is
> -/// `true`, the new config and current generation are stored in the
> +/// `true`, the new config and bumped generation are stored in the
> /// cache. Callers that set `update_cache = true` must hold the
> /// datastore config lock to avoid racing with concurrent config
> /// changes.
> /// - If `update_cache` is `false`, the freshly read config is returned
> -/// but the cache is left unchanged.
> +/// but the cache and generation are left unchanged.
> ///
> /// If `ConfigVersionCache` is not available, the config is always read
> /// from disk and `None` is returned as the generation.
> @@ -333,14 +358,23 @@ fn datastore_section_config_cached(
does this part here make any sense in this patch?
we don't check the generation in the Drop handler anyway, so it will get
the latest cached version, no matter what?
we'd only end up in this part of the code via lookup_datastore, and only
if:
- the previous cached entry and the current one have a different
generation -> no need to bump again, the cache is already invalidated
- there is no previous cached entry -> nothing to invalidate
I think this part should move to the next patch..
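to illustrate (toy model, names are made up, not the actual helper): the
staleness check already compares the stored generation against the current
one, so a save that bumped the generation forces a re-read on its own, and
storing the current generation on the read side is sufficient:

```rust
use std::cell::Cell;

// toy model of a generation-checked config cache
struct CacheEntry {
    config: &'static str,
    last_generation: u64,
}

fn cached_config(
    cache: &mut Option<CacheEntry>,
    current_gen: u64,
    read_disk: impl Fn() -> &'static str,
) -> &'static str {
    if let Some(entry) = cache {
        if entry.last_generation == current_gen {
            return entry.config; // fast path: generation unchanged
        }
    }
    // slow path: generation differs (or no entry yet) -> re-read from disk;
    // storing current_gen is enough, no extra bump needed on the read side
    let config = read_disk();
    *cache = Some(CacheEntry {
        config,
        last_generation: current_gen,
    });
    config
}

fn main() {
    let reads = Cell::new(0u32);
    let read_disk = || {
        reads.set(reads.get() + 1);
        "config-v1"
    };
    let mut cache = None;
    cached_config(&mut cache, 1, &read_disk); // no entry -> disk read
    cached_config(&mut cache, 1, &read_disk); // same generation -> cached
    assert_eq!(reads.get(), 1);
    cached_config(&mut cache, 2, &read_disk); // bumped by a save -> re-read
    assert_eq!(reads.get(), 2);
}
```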
> let (config_raw, _digest) = pbs_config::datastore::config()?;
> let config = Arc::new(config_raw);
>
> + let mut effective_gen = current_gen;
> if update_cache {
> + // Bump the generation. This ensures that Drop
> + // handlers will detect that a newer config exists
> + // and will not rely on a stale cached entry for
> + // maintenance mandate.
> + let prev_gen = version_cache.increase_datastore_generation();
> + effective_gen = prev_gen + 1;
> +
> + // Persist
> *config_cache = Some(DatastoreConfigCache {
> config: config.clone(),
> - last_generation: current_gen,
> + last_generation: effective_gen,
> });
> }
>
> - Ok((config, Some(current_gen)))
> + Ok((config, Some(effective_gen)))
> } else {
> // Fallback path, no config version cache: read datastore.cfg and return None as generation
> *config_cache = None;
> --
> 2.47.3
>