[pbs-devel] [PATCH proxmox-backup v3 2/4] pbs-config: cache verified API token secrets
Fabian Grünbichler
f.gruenbichler at proxmox.com
Fri Jan 16 16:29:17 CET 2026
Quoting Samuel Rufinatscha (2026-01-16 16:13:17)
> On 1/14/26 11:44 AM, Fabian Grünbichler wrote:
> > On January 2, 2026 5:07 pm, Samuel Rufinatscha wrote:
> >> Currently, every token-based API request reads the token.shadow file and
> >> runs the expensive password hash verification for the given token
> >> secret. This shows up as a hotspot in /status profiling (see
> >> bug #7017 [1]).
> >>
> >> This patch introduces an in-memory cache of successfully verified token
> >> secrets. Subsequent requests for the same token+secret combination only
> >> perform a comparison using openssl::memcmp::eq and avoid re-running the
> >> password hash. The cache is updated when a token secret is set and
> >> cleared when a token is deleted. Note that this does NOT include manual
> >> config changes, which will be covered in a subsequent patch.
> >>
> >> This patch is part of the series which fixes bug #7017 [1].
> >>
> >> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> >>
> >> Signed-off-by: Samuel Rufinatscha <s.rufinatscha at proxmox.com>
> >> ---
> >> Changes from v1 to v2:
> >>
> >> * Replace OnceCell with LazyLock, and std::sync::RwLock with
> >> parking_lot::RwLock.
> >> * Add API_MUTATION_GENERATION and guard cache inserts
> >> to prevent “zombie inserts” across concurrent set/delete.
> >> * Refactor cache operations into cache_try_secret_matches,
> >> cache_try_insert_secret, and centralize write-side behavior in
> >> apply_api_mutation.
> >> * Switch fast-path cache access to try_read/try_write (best-effort).
> >>
> >> Changes from v2 to v3:
> >>
> >> * Replaced process-local cache invalidation (AtomicU64
> >> API_MUTATION_GENERATION) with a cross-process shared generation via
> >> ConfigVersionCache.
> >> * Validate shared generation before/after the constant-time secret
> >> compare; only insert into cache if the generation is unchanged.
> >> * invalidate_cache_state() on insert if shared generation changed.
> >>
> >> Cargo.toml | 1 +
> >> pbs-config/Cargo.toml | 1 +
> >> pbs-config/src/token_shadow.rs | 157 ++++++++++++++++++++++++++++++++-
> >> 3 files changed, 158 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/Cargo.toml b/Cargo.toml
> >> index 1aa57ae5..821b63b7 100644
> >> --- a/Cargo.toml
> >> +++ b/Cargo.toml
> >> @@ -143,6 +143,7 @@ nom = "7"
> >> num-traits = "0.2"
> >> once_cell = "1.3.1"
> >> openssl = "0.10.40"
> >> +parking_lot = "0.12"
> >> percent-encoding = "2.1"
> >> pin-project-lite = "0.2"
> >> regex = "1.5.5"
> >> diff --git a/pbs-config/Cargo.toml b/pbs-config/Cargo.toml
> >> index 74afb3c6..eb81ce00 100644
> >> --- a/pbs-config/Cargo.toml
> >> +++ b/pbs-config/Cargo.toml
> >> @@ -13,6 +13,7 @@ libc.workspace = true
> >> nix.workspace = true
> >> once_cell.workspace = true
> >> openssl.workspace = true
> >> +parking_lot.workspace = true
> >> regex.workspace = true
> >> serde.workspace = true
> >> serde_json.workspace = true
> >> diff --git a/pbs-config/src/token_shadow.rs b/pbs-config/src/token_shadow.rs
> >> index 640fabbf..fa84aee5 100644
> >> --- a/pbs-config/src/token_shadow.rs
> >> +++ b/pbs-config/src/token_shadow.rs
> >> @@ -1,6 +1,8 @@
> >> use std::collections::HashMap;
> >> +use std::sync::LazyLock;
> >>
> >> use anyhow::{bail, format_err, Error};
> >> +use parking_lot::RwLock;
> >> use serde::{Deserialize, Serialize};
> >> use serde_json::{from_value, Value};
> >>
> >> @@ -13,6 +15,18 @@ use crate::{open_backup_lockfile, BackupLockGuard};
> >> const LOCK_FILE: &str = pbs_buildcfg::configdir!("/token.shadow.lock");
> >> const CONF_FILE: &str = pbs_buildcfg::configdir!("/token.shadow");
> >>
> >> +/// Global in-memory cache for successfully verified API token secrets.
> >> +/// The cache stores plain text secrets for token Authids that have already been
> >> +/// verified against the hashed values in `token.shadow`. This allows for cheap
> >> +/// subsequent authentications for the same token+secret combination, avoiding
> >> +/// recomputing the password hash on every request.
> >> +static TOKEN_SECRET_CACHE: LazyLock<RwLock<ApiTokenSecretCache>> = LazyLock::new(|| {
> >> +    RwLock::new(ApiTokenSecretCache {
> >> +        secrets: HashMap::new(),
> >> +        shared_gen: 0,
> >> +    })
> >> +});
> >> +
> >> #[derive(Serialize, Deserialize)]
> >> #[serde(rename_all = "kebab-case")]
> >> /// ApiToken id / secret pair
> >> @@ -54,9 +68,27 @@ pub fn verify_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
> >>         bail!("not an API token ID");
> >>     }
> >>
> >> +    // Fast path
> >> +    if cache_try_secret_matches(tokenid, secret) {
> >> +        return Ok(());
> >> +    }
> >> +
> >> +    // Slow path
> >> +    // First, capture the shared generation before doing the hash verification.
> >> +    let gen_before = token_shadow_shared_gen();
> >> +
> >>     let data = read_file()?;
> >>     match data.get(tokenid) {
> >> -        Some(hashed_secret) => proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret),
> >> +        Some(hashed_secret) => {
> >> +            proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret)?;
> >> +
> >> +            // Try to cache only if nothing changed while verifying the secret.
> >> +            if let Some(gen) = gen_before {
> >> +                cache_try_insert_secret(tokenid.clone(), secret.to_owned(), gen);
> >> +            }
> >> +
> >> +            Ok(())
> >> +        }
> >>         None => bail!("invalid API token"),
> >>     }
> >> }
> >> @@ -82,6 +114,8 @@ fn set_secret(tokenid: &Authid, secret: &str) -> Result<(), Error> {
> >>     data.insert(tokenid.clone(), hashed_secret);
> >>     write_file(data)?;
> >>
> >> +    apply_api_mutation(tokenid, Some(secret));
> >> +
> >>     Ok(())
> >> }
> >>
> >> @@ -97,5 +131,126 @@ pub fn delete_secret(tokenid: &Authid) -> Result<(), Error> {
> >>     data.remove(tokenid);
> >>     write_file(data)?;
> >>
> >> +    apply_api_mutation(tokenid, None);
> >> +
> >>     Ok(())
> >> }
> >> +
> >> +struct ApiTokenSecretCache {
> >> +    /// Keys are token Authids, values are the corresponding plain text secrets.
> >> +    /// Entries are added after a successful on-disk verification in
> >> +    /// `verify_secret` or when a new token secret is generated by
> >> +    /// `generate_and_set_secret`. Used to avoid repeated
> >> +    /// password-hash computation on subsequent authentications.
> >> +    secrets: HashMap<Authid, CachedSecret>,
> >> +    /// Shared generation to detect mutations of the underlying token.shadow file.
> >> +    shared_gen: usize,
> >> +}
> >> +
> >> +/// Cached secret.
> >> +struct CachedSecret {
> >> +    secret: String,
> >> +}
> >> +
> >> +fn cache_try_insert_secret(tokenid: Authid, secret: String, shared_gen_before: usize) {
> >> +    let Some(mut cache) = TOKEN_SECRET_CACHE.try_write() else {
> >> +        return;
> >> +    };
> >> +
> >> +    let Some(shared_gen_now) = token_shadow_shared_gen() else {
> >> +        return;
> >> +    };
> >> +
> >> +    // If this process missed a generation bump, its cache is stale.
> >> +    if cache.shared_gen != shared_gen_now {
> >> +        invalidate_cache_state(&mut cache);
> >> +        cache.shared_gen = shared_gen_now;
> >> +    }
> >> +
> >> +    // If a mutation happened while we were verifying the secret, do not insert.
> >> +    if shared_gen_now == shared_gen_before {
> >> +        cache.secrets.insert(tokenid, CachedSecret { secret });
> >> +    }
> >> +}
> >> +
> >> +// Tries to match the given token secret against the cached secret.
> >> +// Checks the generation before and after the constant-time compare to avoid a
> >> +// TOCTOU window. If another process rotates/deletes a token while we're validating
> >> +// the cached secret, the generation will change, and we
> >> +// must not trust the cache for this request.
> >> +fn cache_try_secret_matches(tokenid: &Authid, secret: &str) -> bool {
> >> +    let Some(cache) = TOKEN_SECRET_CACHE.try_read() else {
> >> +        return false;
> >> +    };
> >> +    let Some(entry) = cache.secrets.get(tokenid) else {
> >> +        return false;
> >> +    };
> >> +
> >> +    let cache_gen = cache.shared_gen;
> >> +
> >> +    let Some(gen1) = token_shadow_shared_gen() else {
> >> +        return false;
> >> +    };
> >> +    if gen1 != cache_gen {
> >> +        return false;
> >> +    }
> >> +
> >> +    let eq = openssl::memcmp::eq(entry.secret.as_bytes(), secret.as_bytes());
> >
> > should we invalidate the cache here for this particular authid in case
> > of a mismatch, to avoid making brute forcing too easy/cheap?
> >
>
> We are not doing a cheap reject: on a mismatch we still fall through to
> verify_crypt_pw(). Evicting on mismatch could, however, enable cache
> thrashing, where wrong secrets for a known tokenid would evict cached
> entries. So I think we should not invalidate here on mismatch.
>
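For reference, the flow described above as a simplified sketch (reusing the
names from the patch, not the literal hunk) - a mismatch never gets a cheaper
rejection path than before, it only skips the fast path:

fn verify_secret_sketch(tokenid: &Authid, secret: &str) -> Result<(), Error> {
    // fast path: only taken when the cached secret actually matches
    if cache_try_secret_matches(tokenid, secret) {
        return Ok(());
    }
    // mismatch or no cache entry: fall back to the full (expensive) check
    let data = read_file()?;
    match data.get(tokenid) {
        Some(hashed_secret) => proxmox_sys::crypt::verify_crypt_pw(secret, hashed_secret),
        None => bail!("invalid API token"),
    }
}
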
> >> +    let Some(gen2) = token_shadow_shared_gen() else {
> >> +        return false;
> >> +    };
> >> +
> >> +    eq && gen2 == cache_gen
> >> +}
> >> +
> >> +fn apply_api_mutation(tokenid: &Authid, new_secret: Option<&str>) {
> >> +    // Signal cache invalidation to other processes (best-effort).
> >> +    let new_shared_gen = bump_token_shadow_shared_gen();
> >> +
> >> +    let mut cache = TOKEN_SECRET_CACHE.write();
> >> +
> >> +    // If we cannot read/bump the shared generation, we cannot safely trust the cache.
> >> +    let Some(gen) = new_shared_gen else {
> >> +        invalidate_cache_state(&mut cache);
> >> +        cache.shared_gen = 0;
> >> +        return;
> >> +    };
> >> +
> >> +    // Update to the post-mutation generation.
> >> +    cache.shared_gen = gen;
> >> +
> >> +    // Apply the new mutation.
> >> +    match new_secret {
> >> +        Some(secret) => {
> >> +            cache.secrets.insert(
> >> +                tokenid.clone(),
> >> +                CachedSecret {
> >> +                    secret: secret.to_owned(),
> >> +                },
> >> +            );
> >> +        }
> >> +        None => {
> >> +            cache.secrets.remove(tokenid);
> >> +        }
> >> +    }
> >> +}
> >> +
> >> +/// Get the current shared generation.
> >> +fn token_shadow_shared_gen() -> Option<usize> {
> >> +    crate::ConfigVersionCache::new()
> >> +        .ok()
> >> +        .map(|cvc| cvc.token_shadow_generation())
> >> +}
> >> +
> >> +/// Bump and return the new shared generation.
> >> +fn bump_token_shadow_shared_gen() -> Option<usize> {
> >> +    crate::ConfigVersionCache::new()
> >> +        .ok()
> >> +        .map(|cvc| cvc.increase_token_shadow_generation() + 1)
> >> +}
> >> +
> >> +/// Invalidates the cache state and only keeps the shared generation.
> >
> > both calls to this actually set the cached generation to some value
> > right after, so maybe this should take a generation directly and set it?
> >
>
> patch 3/4 doesn’t always update the gen on cache invalidation
> (shadow_mtime_len() error branch in apply_api_mutation) but most other
> call sites do. Agreed this can be refactored, maybe:
that one sets the generation before (potentially) invalidating the cache
though, so we could unconditionally reset the generation to that value when
invalidating... should we maybe also re-order the lock and bump there?
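e.g. something along these lines (rough sketch, untested - the single-helper
variant of what you propose below):

fn invalidate_cache_state(cache: &mut ApiTokenSecretCache, gen: usize) {
    // drop all cached secrets (and, with patch 3/4, the mtime/len/last_checked
    // fields) and pin the cache to the generation the caller just observed
    cache.secrets.clear();
    cache.shared_gen = gen;
}

then apply_api_mutation() and cache_try_insert_secret() would just pass in the
generation they already have at hand.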
>
> fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
>     cache.secrets.clear();
>     // clear other cache fields (mtime/len/last_checked) as needed
> }
>
> fn invalidate_cache_state_and_set_gen(cache: &mut ApiTokenSecretCache, gen: usize) {
>     invalidate_cache_state(cache);
>     cache.shared_gen = gen;
> }
>
> We could also do a single helper with Option<usize> but two helpers make
> the call sites more explicit.
>
> >> +fn invalidate_cache_state(cache: &mut ApiTokenSecretCache) {
> >> +    cache.secrets.clear();
> >> +}
> >> --
> >> 2.47.3