[pbs-devel] [PATCH v4 proxmox-backup 0/5] fix #5331: GC: avoid multiple atime updates
Christian Ebner
c.ebner at proxmox.com
Fri Mar 21 10:31:57 CET 2025
These patches implement the logic to greatly improve the performance
of phase 1 garbage collection by avoiding multiple atime updates on
the same chunk.
Currently, phase 1 GC iterates over all folders in the datastore,
looking for and collecting all image index files without making any
assumptions about the logical layout (e.g. namespaces, groups,
snapshots, ...). This is to avoid accidentally missing image index files
located in unexpected paths and therefore not marking their chunks as in
use, leading to potential data loss.
These patches improve phase 1 by:
- Iterating over index files using the datastore's iterators to detect
regular index files. Paths outside of the iterator logic are still taken
into account and processed as well: a list of all index files found is
generated first, index files encountered while iterating are removed
from it, and what remains is a list of index files with unexpected
paths. These unexpected paths are now also logged, so the user can take
action if needed.
- Keeping track of recently touched chunks by storing their digests in an
LRU cache, skipping expensive atime updates for chunks already present
in the cache.
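The two steps above can be sketched roughly as follows. This is a
minimal, std-only illustration and not the code from the patches: the
names `RecentlyTouched`, `mark_used` and `unexpected_index_files` are
hypothetical, the small LRU set stands in for the `LruCache` from
pbs-tools, and the `touch` closure stands in for the actual atime update.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};
use std::path::PathBuf;

/// Chunk digest (assumed to be a 32 byte hash for this sketch).
pub type Digest = [u8; 32];

/// Hypothetical bounded LRU set of recently touched chunk digests.
pub struct RecentlyTouched {
    capacity: usize,
    clock: u64,
    by_digest: HashMap<Digest, u64>,
    by_age: BTreeMap<u64, Digest>,
}

impl RecentlyTouched {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self {
            capacity,
            clock: 0,
            by_digest: HashMap::new(),
            by_age: BTreeMap::new(),
        }
    }

    /// Marks `digest` as most recently used.
    /// Returns `true` if it was newly inserted, `false` if already present.
    pub fn insert(&mut self, digest: Digest) -> bool {
        self.clock += 1;
        let newly_inserted = match self.by_digest.insert(digest, self.clock) {
            Some(old_tick) => {
                // Already cached: drop the stale age entry, refresh below.
                self.by_age.remove(&old_tick);
                false
            }
            None => true,
        };
        self.by_age.insert(self.clock, digest);
        if self.by_digest.len() > self.capacity {
            // Evict the least recently used digest.
            if let Some((_, oldest)) = self.by_age.pop_first() {
                self.by_digest.remove(&oldest);
            }
        }
        newly_inserted
    }
}

/// Calls `touch` only for digests not present in the cache.
/// Returns the number of chunks actually touched.
pub fn mark_used<F>(digests: &[Digest], cache: &mut RecentlyTouched, mut touch: F) -> usize
where
    F: FnMut(&Digest),
{
    let mut touched = 0;
    for digest in digests {
        if cache.insert(*digest) {
            touch(digest);
            touched += 1;
        }
    }
    touched
}

/// Index files found on disk that the regular iteration did not visit.
pub fn unexpected_index_files(
    all_found: &[PathBuf],
    seen_via_iterators: &[PathBuf],
) -> HashSet<PathBuf> {
    let mut remaining: HashSet<PathBuf> = all_found.iter().cloned().collect();
    for path in seen_via_iterators {
        remaining.remove(path);
    }
    remaining
}
```

A digest that was evicted from the cache is simply touched again on its
next occurrence, so a bounded cache can only cost extra atime updates,
never skip a required one.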
Most notable changes since version 3 (thanks Wolfgang for feedback):
- Use `with_context` over `context` to avoid possibly unnecessary allocation
- Align terminology with docs and rest of the codebase by using index
file instead of image in method and variable names.
Most notable changes since version 2 (thanks Fabian for feedback):
- Use LRU cache instead of keeping track of chunks from the previous
snapshot in the group.
- Split patches to logically separate iteration from caching logic
- Adapt for better anyhow context error propagation and formatting
Most notable changes since version 1 (thanks Fabian for feedback):
- Logically iterate using pre-existing iterators instead of constructing
data structure for iteration when listing images.
- Tested that double listing does not affect runtime.
- Chunks are now remembered for all archives per snapshot, not just a
single archive per snapshot as previously. This more closely mimics
the backup behaviour and gives additional gains in some cases.
The data below is still from version 3, as version 4 of the patches
contains no logical changes affecting runtime.
Statistics were generated by averaging 3 GC runtimes, each measured after
an initial run to warm up caches. Datastores A (1515 index files) and B
(192 index files) are unrelated, containing "real" backups.
The syscall counts were generated using
`strace -f -e utimensat -p $(pidof proxmox-backup-proxy) 2&> dump.trace` and
(after small cleanup) `wc -l dump.trace`.
datastore A on spinning disk:
unpatched: 115.3 ± 0.5 s, utimensat calls: 6864667
version 2: 52.0 ± 0.0 s, utimensat calls: 1151734
version 3: 24.6 ± 0.5 s, utimensat calls: 1079149
datastore B on SSD:
unpatched: 44.6 ± 3.2 s, utimensat calls: 2034949
version 2: 15.0 ± 1.0 s, utimensat calls: 562929
version 3: 14.6 ± 0.5 s, utimensat calls: 559053
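For comparing the rows above, the relative gains can be derived from the
listed values. A small sketch, with the hypothetical helpers `speedup`
and `call_reduction` that are not part of the patches:

```rust
/// Runtime speedup factor of the patched run over the unpatched run.
pub fn speedup(unpatched_secs: f64, patched_secs: f64) -> f64 {
    unpatched_secs / patched_secs
}

/// Fraction of utimensat calls avoided (0.0 = none, 1.0 = all).
pub fn call_reduction(calls_before: u64, calls_after: u64) -> f64 {
    1.0 - calls_after as f64 / calls_before as f64
}
```

For example, `speedup(115.3, 24.6)` and `call_reduction(6864667, 1079149)`
compare the unpatched and version 3 rows of datastore A.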
Christian Ebner (5):
  tools: lru cache: tell if node was already present or newly inserted
  garbage collection: format error including anyhow error context
  datastore: add helper method to open index reader from path
  garbage collection: generate index file list via datastore iterators
  fix #5331: garbage collection: avoid multiple chunk atime updates

 pbs-datastore/src/datastore.rs  | 188 +++++++++++++++++++++++---------
 pbs-tools/src/lru_cache.rs      |   4 +-
 src/api2/admin/datastore.rs     |   6 +-
 src/bin/proxmox-backup-proxy.rs |   2 +-
 4 files changed, 140 insertions(+), 60 deletions(-)
--
2.39.5