[pbs-devel] [PATCH proxmox-backup v2 4/4] partial fix #6049: datastore: add TTL fallback to catch manual config edits
Fabian Grünbichler
f.gruenbichler at proxmox.com
Wed Nov 19 14:24:36 CET 2025
On November 14, 2025 4:05 pm, Samuel Rufinatscha wrote:
> The lookup fast path reacts to API-driven config changes because
> save_config() bumps the generation. Manual edits of datastore.cfg do
> not bump the counter. To keep the system robust against such edits
> without reintroducing config reading and hashing on the hot path, this
> patch adds a TTL to the cache entry.
>
> If the cached config is older than
> DATASTORE_CONFIG_CACHE_TTL_SECS (set to 60s), the next lookup takes
> the slow path and refreshes the cached entry. Within
> the TTL window, unchanged generations still use the fast path.
>
>
> Refs: #6049
> Signed-off-by: Samuel Rufinatscha <s.rufinatscha at proxmox.com>
> ---
> pbs-datastore/src/datastore.rs | 46 +++++++++++++++++++++++++---------
> 1 file changed, 34 insertions(+), 12 deletions(-)
>
> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> index 0fabf592..7a18435c 100644
> --- a/pbs-datastore/src/datastore.rs
> +++ b/pbs-datastore/src/datastore.rs
> @@ -22,7 +22,7 @@ use proxmox_sys::error::SysError;
> use proxmox_sys::fs::{file_read_optional_string, replace_file, CreateOptions};
> use proxmox_sys::linux::procfs::MountInfo;
> use proxmox_sys::process_locker::{ProcessLockExclusiveGuard, ProcessLockSharedGuard};
> -use proxmox_time::TimeSpan;
> +use proxmox_time::{epoch_i64, TimeSpan};
> use proxmox_worker_task::WorkerTaskContext;
>
> use pbs_api_types::{
> @@ -53,6 +53,8 @@ struct DatastoreConfigCache {
> config: Arc<SectionConfigData>,
> // Generation number from ConfigVersionCache
> last_generation: usize,
> + // Last update time (epoch seconds)
> + last_update: i64,
> }
>
> static DATASTORE_CONFIG_CACHE: LazyLock<Mutex<Option<DatastoreConfigCache>>> =
> @@ -61,6 +63,8 @@ static DATASTORE_CONFIG_CACHE: LazyLock<Mutex<Option<DatastoreConfigCache>>> =
> static DATASTORE_MAP: LazyLock<Mutex<HashMap<String, Arc<DataStoreImpl>>>> =
> LazyLock::new(|| Mutex::new(HashMap::new()));
>
> +/// Max age in seconds to reuse the cached datastore config.
> +const DATASTORE_CONFIG_CACHE_TTL_SECS: i64 = 60;
> /// Filename to store backup group notes
> pub const GROUP_NOTES_FILE_NAME: &str = "notes";
> /// Filename to store backup group owner
> @@ -295,16 +299,22 @@ impl DatastoreBackend {
>
> /// Return the cached datastore SectionConfig and its generation.
> fn datastore_section_config_cached() -> Result<(Arc<SectionConfigData>, Option<usize>), Error> {
> - let gen = ConfigVersionCache::new()
> - .ok()
> - .map(|c| c.datastore_generation());
> + let now = epoch_i64();
> + let version_cache = ConfigVersionCache::new().ok();
> + let current_gen = version_cache.as_ref().map(|c| c.datastore_generation());
>
> let mut guard = DATASTORE_CONFIG_CACHE.lock().unwrap();
>
> - // Fast path: re-use cached datastore.cfg
> - if let (Some(gen), Some(cache)) = (gen, guard.as_ref()) {
> - if cache.last_generation == gen {
> - return Ok((cache.config.clone(), Some(gen)));
> + // Fast path: re-use cached datastore.cfg if cache is available, generation matches and TTL not expired
> + if let (Some(current_gen), Some(config_cache)) = (current_gen, guard.as_ref()) {
> + let gen_matches = config_cache.last_generation == current_gen;
> + let ttl_ok = (now - config_cache.last_update) < DATASTORE_CONFIG_CACHE_TTL_SECS;
> +
> + if gen_matches && ttl_ok {
> + return Ok((
> + config_cache.config.clone(),
> + Some(config_cache.last_generation),
> + ));
> }
> }
>
> @@ -312,16 +322,28 @@ fn datastore_section_config_cached() -> Result<(Arc<SectionConfigData>, Option<u
> let (config_raw, _digest) = pbs_config::datastore::config()?;
> let config = Arc::new(config_raw);
>
> - if let Some(gen_val) = gen {
> + // Update cache
> + let new_gen = if let Some(handle) = version_cache {
> + // Bump datastore generation whenever we reload the config.
> + // This ensures that Drop handlers will detect that a newer config exists
> + // and will not rely on a stale cached entry for maintenance mode checks.
> + let prev_gen = handle.increase_datastore_generation();
This could be optimized further if we keep the digest when we load and
parse the config above, because we only need to bump the generation if
the digest actually changed. We need to bump the timestamp always of
course ;) Also, we only want to bump if we previously had a generation
saved; if we didn't, then this is the first load and bumping is
meaningless anyway..

But there is another issue here: this is now also called in the Drop
handler, where we don't hold the config lock, so we have no guard
against a parallel config change via the API that also bumps the
generation between us reloading the config and us bumping it here.
That means we could end up with a mismatch between the value in
new_gen and the actual config we loaded..

I think we need to extend this helper with a bool flag that determines
whether we want to reload if the TTL has expired, or return
potentially outdated information. *Every* regular lookup will handle
the TTL anyway (by setting that parameter), so just fetching the
freshest info we can get without reloading (by not setting it) should
be fine for the Drop handler..
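To illustrate what I mean, here is a minimal, self-contained sketch in plain Rust of such a flag-controlled cache lookup. All names (`section_config_cached`, `reload_on_expiry`, the `String` standing in for `Arc<SectionConfigData>`) are illustrative only, not the actual PBS API, and the generation-match check, digest comparison and config locking are deliberately elided:

```rust
use std::sync::{Mutex, OnceLock};
use std::time::{SystemTime, UNIX_EPOCH};

/// Max age in seconds before a cached entry counts as expired.
const CACHE_TTL_SECS: i64 = 60;

#[derive(Clone)]
struct CachedConfig {
    config: String, // stands in for Arc<SectionConfigData>
    generation: usize,
    last_update: i64,
}

static CACHE: OnceLock<Mutex<Option<CachedConfig>>> = OnceLock::new();

fn now_secs() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .as_secs() as i64
}

/// Stand-in for reading and parsing datastore.cfg from disk.
fn load_config_from_disk() -> String {
    "parsed datastore.cfg".to_string()
}

/// Cached config lookup. With `reload_on_expiry == false` (e.g. called
/// from a Drop handler that must not race concurrent config writers),
/// an expired entry is still returned as "potentially outdated" info
/// instead of being reloaded. Regular lookups pass `true` and handle
/// the TTL by taking the slow path.
fn section_config_cached(reload_on_expiry: bool) -> (String, Option<usize>) {
    let now = now_secs();
    let mut guard = CACHE.get_or_init(|| Mutex::new(None)).lock().unwrap();

    if let Some(cached) = guard.as_ref() {
        let fresh = (now - cached.last_update) < CACHE_TTL_SECS;
        if fresh || !reload_on_expiry {
            // Fast path: reuse the cached entry (generation-match
            // check elided in this sketch).
            return (cached.config.clone(), Some(cached.generation));
        }
    }

    // Slow path: reload from disk and refresh the cache entry. On the
    // very first load there is no previous generation, so reloading is
    // unavoidable even for `reload_on_expiry == false` callers.
    let config = load_config_from_disk();
    let generation = guard.as_ref().map(|c| c.generation + 1).unwrap_or(0);
    *guard = Some(CachedConfig {
        config: config.clone(),
        generation,
        last_update: now,
    });
    (config, Some(generation))
}

/// Test helper: backdate the cached entry so the TTL counts as expired.
fn expire_cached_entry() {
    let mut guard = CACHE.get_or_init(|| Mutex::new(None)).lock().unwrap();
    if let Some(cached) = guard.as_mut() {
        cached.last_update -= CACHE_TTL_SECS + 1;
    }
}
```

The point is that after the TTL expires, `section_config_cached(false)` still returns the old generation without touching the config or the version counter, while the next `section_config_cached(true)` takes the slow path and bumps it.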
> + let new_gen = prev_gen + 1;
> +
> *guard = Some(DatastoreConfigCache {
> config: config.clone(),
> - last_generation: gen_val,
> + last_generation: new_gen,
> + last_update: now,
> });
> +
> + Some(new_gen)
> } else {
> + // if the version cache was not available, take the slow path again next time
> *guard = None;
> - }
> + None
> + };
>
> - Ok((config, gen))
> + Ok((config, new_gen))
> }
>
> impl DataStore {
> --
> 2.47.3
>
>
>
> _______________________________________________
> pbs-devel mailing list
> pbs-devel at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>
>
>