[pbs-devel] superseded: [PATCH proxmox{-backup, , -datacenter-manager} v3 00/10] token-shadow: reduce api token verification overhead

Samuel Rufinatscha s.rufinatscha at proxmox.com
Wed Jan 21 16:15:40 CET 2026


https://lore.proxmox.com/pbs-devel/20260121151408.731516-1-s.rufinatscha@proxmox.com/T/#t

On 1/2/26 5:07 PM, Samuel Rufinatscha wrote:
> Hi,
> 
> this series improves the performance of token-based API authentication
> in PBS (pbs-config) and in PDM (underlying proxmox-access-control
> crate), addressing the API token verification hotspot reported in our
> bugtracker #7017 [1].
> 
> When profiling the PBS /status endpoint with cargo flamegraph [2],
> token-based authentication showed up as a dominant hotspot via
> proxmox_sys::crypt::verify_crypt_pw. Applying this series removes that
> path from the hot section of the flamegraph. The same performance issue
> was measured [3] for PDM. PDM uses the underlying shared
> proxmox-access-control library for token handling, which is a
> factored-out version of the token.shadow handling code from PBS.
> 
> While this series fixes the immediate performance issue both in PBS
> (pbs-config) and in the shared proxmox-access-control crate used by
> PDM, PBS should ideally be refactored, in a separate effort, to use
> proxmox-access-control for token handling instead of its local
> implementation.
> 
> Problem
> 
> For token-based API requests, both PBS’s pbs-config token.shadow
> handling and PDM’s proxmox-access-control token.shadow handling
> currently:
> 
> 1. read the token.shadow file on each request
> 2. deserialize it into a HashMap<Authid, String>
> 3. run password hash verification via
>     proxmox_sys::crypt::verify_crypt_pw for the provided token secret
> 
> Under load, this results in significant CPU usage spent in repeated
> password hashing for the same token+secret pairs. The attached
> flamegraphs for PBS [2] and PDM [3] show
> proxmox_sys::crypt::verify_crypt_pw dominating the hot path.
> 
> Approach
> 
> The goal is to reduce the cost of token-based authentication while
> preserving the existing token handling semantics (including detecting
> manual edits to token.shadow) and staying consistent between PBS
> (pbs-config) and PDM (proxmox-access-control). For both code bases,
> this series proposes to:
> 
> 1. Introduce an in-memory cache for verified token secrets and
> invalidate it through a shared ConfigVersionCache generation. Note that
> a shared generation is required to keep the privileged and
> unprivileged daemons in sync and avoid caching inconsistencies across
> processes.
> 2. Invalidate on token.shadow changes made through the API
> (set_secret, delete_secret)
> 3. Invalidate on direct/manual token.shadow file changes (mtime +
> length)
> 4. Avoid per-request file stat calls using a TTL window
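
To illustrate points 1 and 2 above, here is a minimal sketch of a
generation-guarded cache. It is not the code from the patches: the
SecretCache type, the slow_verify callback, and the process-local
SHARED_GENERATION counter are assumptions made for illustration only. In
the series the generation comes from ConfigVersionCache so that both
daemons see it.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Stand-in for the shared generation counter (assumption: the real one
/// is provided by ConfigVersionCache and is visible to both daemons).
static SHARED_GENERATION: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical cache of secrets that already passed the expensive check.
#[derive(Default)]
struct SecretCache {
    generation: usize,
    secrets: HashMap<String, String>,
}

impl SecretCache {
    /// Drop all entries if the shared generation moved since the last use.
    fn sync_generation(&mut self) {
        let current = SHARED_GENERATION.load(Ordering::Acquire);
        if current != self.generation {
            self.secrets.clear();
            self.generation = current;
        }
    }

    /// Returns true if `secret` is accepted for `tokenid`. `slow_verify`
    /// stands in for the password hash verification and runs only on a miss.
    fn verify(
        &mut self,
        tokenid: &str,
        secret: &str,
        slow_verify: impl Fn(&str, &str) -> bool,
    ) -> bool {
        self.sync_generation();
        // Plain equality for brevity; the series compares in constant time.
        if self.secrets.get(tokenid).map(String::as_str) == Some(secret) {
            return true;
        }
        let ok = slow_verify(tokenid, secret);
        if ok {
            self.secrets.insert(tokenid.to_string(), secret.to_string());
        }
        ok
    }
}

/// Would be called after set_secret/delete_secret style changes.
fn bump_generation() {
    SHARED_GENERATION.fetch_add(1, Ordering::AcqRel);
}
```

With this shape, repeated requests with the same token and secret reach
slow_verify only once per generation, and any bump_generation() call
forces the next request back onto the slow path.
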
> 
> Testing
> 
> *PBS (pbs-config)*
> 
> To verify the effect in PBS, I:
> 1. Set up a test environment based on the latest PBS ISO, installed
>     the Rust toolchain, and cloned the proxmox-backup repository for use
>     with cargo flamegraph. Reproduced bug #7017 [1] by profiling the
>     /status endpoint with token-based authentication using cargo
>     flamegraph [2].
> 2. Built PBS with the pbs-config patches and re-ran the same workload
>     and profiling setup. Confirmed that the
>     proxmox_sys::crypt::verify_crypt_pw path no longer appears in the
>     hot section of the flamegraph. CPU usage is now dominated by TLS
>     overhead.
> 3. Functionally, I verified that:
>     * valid tokens authenticate correctly when used in API requests
>     * invalid secrets are rejected as before
>     * generating a new token secret via the dashboard (create token for
>     user, regenerate existing secret) works and authenticates correctly
> 
> *PDM (proxmox-access-control)*
> 
> To verify the effect in PDM, I followed a similar testing approach.
> Instead of PBS’ /status, I profiled the /version endpoint with cargo
> flamegraph [3] and verified that the expensive hashing path disappears
> from the hot section after introducing caching.
> 
> Functionally, I verified that:
>     * valid tokens authenticate correctly when used in API requests
>     * invalid secrets are rejected as before
>     * generating a new token secret via the dashboard (create token for
>     user, regenerate existing secret) works and authenticates correctly
> 
> Benchmarks:
> 
> Two different benchmarks have been run to measure caching effects
> and RwLock contention:
> 
> (1) Requests per second for PBS /status endpoint (E2E)
> 
> Benchmarked parallel token auth requests for
> /status?verbose=0 on top of the datastore lookup cache series [4]
> to check throughput impact. With datastores=1, repeat=5000, parallel=16
> this series gives ~172 req/s compared to ~65 req/s without it.
> This is a ~2.6x improvement (and aligns with the ~179 req/s from the
> previous series, which used per-process cache invalidation).
> 
> (2) RwLock contention for token create/delete under heavy load of
> token-authenticated requests
> 
> The previous version of the series compared std::sync::RwLock and
> parking_lot::RwLock contention for token create/delete under heavy
> parallel token-authenticated readers. parking_lot::RwLock has been
> chosen for the added fairness guarantees.
> 
> Patch summary
> 
> pbs-config:
> 
> 0001 – pbs-config: add token.shadow generation to ConfigVersionCache
> Extends ConfigVersionCache to provide a process-shared generation
> number for token.shadow changes.
> 
> 0002 – pbs-config: cache verified API token secrets
> Adds an in-memory cache for verified, plain-text API token secrets.
> The cache is invalidated through the process-shared ConfigVersionCache
> generation number. Uses openssl’s constant-time memcmp to compare
> secrets.
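
For readers unfamiliar with why the cached-secret comparison should be
constant-time, here is a small sketch of the idea using only the
standard library. It is a stand-in for illustration and not what the
patch does; the patch relies on openssl's memcmp, and real code should
prefer such a vetted primitive since a compiler may optimize a
hand-written loop. The function name constant_time_eq is hypothetical.

```rust
/// Compare two byte strings without returning early on the first
/// mismatching byte. Illustrative stand-in for a vetted primitive.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the length is not treated as secret in this sketch
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences so every byte is always inspected.
        diff |= x ^ y;
    }
    diff == 0
}
```

The point is that the time taken does not depend on how many leading
bytes of a guessed secret happen to match the cached one.
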
> 
> 0003 – pbs-config: invalidate token-secret cache on token.shadow
> changes
> On each token verification request, stats token.shadow (mtime and
> length) and clears the cache when the file has changed.
> 
> 0004 – pbs-config: add TTL window to token-secret cache
> Introduces a TTL (TOKEN_SECRET_CACHE_TTL_SECS, default 60) for metadata
> checks so that fs::metadata calls are not performed on each request.
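
The interaction between the mtime/length check (0003) and the TTL window
(0004) could look roughly like the sketch below. The FileWatch type and
its fields are hypothetical, and treating the very first check as "not
changed" is an assumption of this sketch, not necessarily what the
patches do.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, Instant, SystemTime};

/// Hypothetical tracker for the metadata of a file such as token.shadow.
struct FileWatch {
    ttl: Duration,
    last_checked: Option<Instant>,
    last_seen: Option<(SystemTime, u64)>, // (mtime, length)
}

impl FileWatch {
    fn new(ttl: Duration) -> Self {
        Self { ttl, last_checked: None, last_seen: None }
    }

    /// Returns Ok(true) when the cache should be cleared. Within the TTL
    /// window no stat call is made, so manual edits are noticed with a
    /// delay of up to `ttl`.
    fn changed(&mut self, path: &Path, now: Instant) -> io::Result<bool> {
        if let Some(t) = self.last_checked {
            if now.duration_since(t) < self.ttl {
                return Ok(false);
            }
        }
        let meta = fs::metadata(path)?;
        let seen = (meta.modified()?, meta.len());
        // The first successful check only records a baseline.
        let changed = self.last_seen.is_some() && self.last_seen != Some(seen);
        self.last_checked = Some(now);
        self.last_seen = Some(seen);
        Ok(changed)
    }
}
```

Passing `now` explicitly keeps the sketch easy to test; with a TTL of 60
seconds (the default named above), at most one fs::metadata call per
minute is made regardless of request rate.
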
> 
> proxmox-access-control:
> 
> 0005 – access-control: extend AccessControlConfig for token.shadow invalidation
> 
> Extends the AccessControlConfig trait with
> token_shadow_cache_generation() and
> increment_token_shadow_cache_generation() for
> proxmox-access-control to get the shared token.shadow generation number
> and bump it on token shadow changes.
> 
> 0006 – access-control: cache verified API token secrets
> Mirrors PBS PATCH 0002.
> 
> 0007 – access-control: invalidate token-secret cache on token.shadow changes
> Mirrors PBS PATCH 0003.
> 
> 0008 – access-control: add TTL window to token-secret cache
> Mirrors PBS PATCH 0004.
> 
> proxmox-datacenter-manager:
> 
> 0009 – pdm-config: add token.shadow generation to ConfigVersionCache
> Extends PDM ConfigVersionCache and implements
> token_shadow_cache_generation() and
> increment_token_shadow_cache_generation() from AccessControlConfig for
> PDM.
> 
> 0010 – docs: document API token-cache TTL effects
> Documents the effects of the TTL window on token.shadow edits.
> 
> Changes from v1 to v2:
> 
> * (refactor) Switched cache initialization to LazyLock
> * (perf) Use parking_lot::RwLock and best-effort cache access on the
>    read/refresh path (try_read/try_write) to avoid lock contention
> * (doc) Document TTL-delayed effect of manual token.shadow edits
> * (fix) Add generation guards (API_MUTATION_GENERATION +
>    FILE_GENERATION) to prevent caching across concurrent set/delete and
>    external edits
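
The "best-effort" cache access mentioned in the v1 to v2 changes can be
pictured as follows. This sketch uses std::sync::RwLock purely to stay
dependency-free, whereas the series uses parking_lot::RwLock; the
cached_secret helper and the map layout are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Look up a cached secret without blocking. If the lock cannot be taken
/// right away (for example a writer holds it), report a miss so that the
/// caller falls back to the regular verification path.
fn cached_secret(
    cache: &RwLock<HashMap<String, String>>,
    tokenid: &str,
) -> Option<String> {
    let guard = cache.try_read().ok()?;
    guard.get(tokenid).cloned()
}
```

A miss caused by contention costs one regular verification but never
stalls a request behind a token create/delete.
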
> 
> Changes from v2 to v3:
> 
> * (refactor) Replace PBS per-process cache invalidation with a
>    cross-process token.shadow generation based on PBS
>    ConfigVersionCache, ensuring cache consistency between privileged
>    and unprivileged daemons.
> * (refactor) Decouple the generation source from the
>    proxmox/proxmox-access-control cache implementation: extend
>    AccessControlConfig hooks so that products can provide the shared
>    token.shadow generation source.
> * (refactor) Extend PDM's ConfigVersionCache with
>    token_shadow_generation and introduce a
>    pdm_config::AccessControlConfig wrapper implementing the new
>    proxmox-access-control trait hooks. Switch server and CLI
>    initialization to use pdm_config::AccessControlConfig instead of
>    pdm_api_types::AccessControlConfig.
> * (refactor) Adapt generation checks around cached-secret comparison to
>    use the new shared generation source.
> * (fix/logic) cache_try_insert_secret: Update the local cache
>    generation if stale, allowing the new secret to be inserted
>    immediately
> * (refactor) Extract cache invalidation logic into an
>    invalidate_cache_state helper to reduce duplication and ensure
>    consistent state resets.
> * (refactor) Simplify refresh_cache_if_file_changed: handle the
>    un-initialized/reset state and adjust the generation mismatch
>    path to ensure file metadata is always re-read.
> * (doc) Clarify TTL-delayed effects of manual token.shadow edits.
> 
> Please see the patch-specific changelogs for more details.
> 
> Thanks for considering this patch series, I look forward to your
> feedback.
> 
> Best,
> Samuel Rufinatscha
> 
> [1] https://bugzilla.proxmox.com/show_bug.cgi?id=7017
> [2] attachment 1767 [1]: Flamegraph showing the proxmox_sys::crypt::verify_crypt_pw stack
> [3] attachment 1794 [1]: Flamegraph PDM baseline
> [4] https://bugzilla.proxmox.com/show_bug.cgi?id=6049
> 
> proxmox-backup:
> 
> Samuel Rufinatscha (4):
>    pbs-config: add token.shadow generation to ConfigVersionCache
>    pbs-config: cache verified API token secrets
>    pbs-config: invalidate token-secret cache on token.shadow changes
>    pbs-config: add TTL window to token secret cache
> 
>   Cargo.toml                             |   1 +
>   docs/user-management.rst               |   4 +
>   pbs-config/Cargo.toml                  |   1 +
>   pbs-config/src/config_version_cache.rs |  18 ++
>   pbs-config/src/token_shadow.rs         | 298 ++++++++++++++++++++++++-
>   5 files changed, 321 insertions(+), 1 deletion(-)
> 
> 
> proxmox:
> 
> Samuel Rufinatscha (4):
>    proxmox-access-control: extend AccessControlConfig for token.shadow
>      invalidation
>    proxmox-access-control: cache verified API token secrets
>    proxmox-access-control: invalidate token-secret cache on token.shadow
>      changes
>    proxmox-access-control: add TTL window to token secret cache
> 
>   Cargo.toml                                 |   1 +
>   proxmox-access-control/Cargo.toml          |   1 +
>   proxmox-access-control/src/init.rs         |  17 ++
>   proxmox-access-control/src/token_shadow.rs | 299 ++++++++++++++++++++-
>   4 files changed, 317 insertions(+), 1 deletion(-)
> 
> 
> proxmox-datacenter-manager:
> 
> Samuel Rufinatscha (2):
>    pdm-config: implement token.shadow generation
>    docs: document API token-cache TTL effects
> 
>   cli/admin/src/main.rs                       |  2 +-
>   docs/access-control.rst                     |  4 ++
>   lib/pdm-config/Cargo.toml                   |  1 +
>   lib/pdm-config/src/access_control_config.rs | 73 +++++++++++++++++++++
>   lib/pdm-config/src/config_version_cache.rs  | 18 +++++
>   lib/pdm-config/src/lib.rs                   |  2 +
>   server/src/acl.rs                           |  3 +-
>   7 files changed, 100 insertions(+), 3 deletions(-)
>   create mode 100644 lib/pdm-config/src/access_control_config.rs
> 
> 
> Summary over all repositories:
>    16 files changed, 738 insertions(+), 5 deletions(-)
> 




