[pbs-devel] [PATCH proxmox-backup 3/5] garbage collection: add structure for optimized image iteration
Fabian Grünbichler
f.gruenbichler at proxmox.com
Fri Mar 7 09:53:15 CET 2025
> Christian Ebner <c.ebner at proxmox.com> hat am 07.03.2025 09:24 CET geschrieben:
>
>
> On 3/5/25 14:47, Fabian Grünbichler wrote:
> > On February 21, 2025 3:01 pm, Christian Ebner wrote:
> >> Implements the GroupedImageList struct and methods, which group
> >> index file (image) paths into a hierarchy for optimized iteration
> >> during phase 1 of garbage collection.
> >>
> >> Currently, phase 1 of garbage collection iterates over all folders in
> >> the datastore, without considering any logical organization. This
> >> avoids missing image indices which might have unexpected paths, as
> >> missing them would lead to the deletion of chunks still in use by
> >> these indices in GC phase 2.
> >>
> >> The new structure allows iterating over the index files in a more
> >> logical order, without missing strange paths. The hierarchical
> >> organization avoids touching shared chunks of incremental snapshot
> >> backups within a backup group multiple times, since these can be
> >> tracked without excessive memory requirements.
> >>
> >> Since deduplication happens on a per-image basis for subsequent
> >> snapshots, the hierarchy is chosen as follows:
> >> - ns/group
> >> - image filename
> >> - snapshot timestamp
> >>
> >> This allows iterating over consecutive snapshots of the same image
> >> within the same backup namespace and group.
> >>
> >> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
> >> ---
> >> pbs-datastore/src/datastore.rs | 63 ++++++++++++++++++++++++++++++++++
> >> 1 file changed, 63 insertions(+)
> >>
> >> diff --git a/pbs-datastore/src/datastore.rs b/pbs-datastore/src/datastore.rs
> >> index eda78193d..520f54548 100644
> >> --- a/pbs-datastore/src/datastore.rs
> >> +++ b/pbs-datastore/src/datastore.rs
> >> @@ -1,4 +1,5 @@
> >> use std::collections::{HashMap, HashSet};
> >> +use std::ffi::OsString;
> >> use std::io::{self, Write};
> >> use std::os::unix::ffi::OsStrExt;
> >> use std::os::unix::io::AsRawFd;
> >> @@ -1573,3 +1574,65 @@ impl DataStore {
> >> Ok(())
> >> }
> >> }
> >> +
> >> +struct GroupedImageList {
> >> + groups: HashMap<String, HashMap<OsString, Vec<(i64, PathBuf)>>>,
> >
> > this seems very complicated, couldn't we make it a lot simpler by doing
> >
> > key: NS + Group (as tuple, using the actual types)
> > value: (snapshot => index paths)
> >
> > or even a nested mapping of
> >
> > NS -> Group -> Snapshot -> Images
> >
> > and then simply reset when proceeding from one snapshot to the next? the
> > scope of "in-memory chunk digests" would still be bounded (in a way that
> > is in line with how we do it in many other parts, like when doing backup
> > + restore), and the structure of this helper entity and the way it is
> > iterated over would feel much more natural.
>
> The type signature looks more complex than it actually is, but I
> do agree that this can be reduced/optimized.
it's not so much the type itself, but the elements (too many String-like
types where we could have proper ones) and the way of grouping things ;)
in the end, it is either a nested map or one with a complex key, no way
around it (well, other than the alternate approach below :))
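The suggested nested mapping (NS -> Group -> Snapshot -> Images) could look roughly like the sketch below. The `String`/`i64` key types here are simplified stand-ins for the proper pbs-api-types (BackupNamespace, BackupGroup) alluded to above, so this is an illustration of the grouping, not the actual implementation:

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

// Stand-in key types; real code would use the proper pbs-api-types
// (BackupNamespace, BackupGroup) instead of plain strings.
type Namespace = String;
type Group = String;
type SnapshotTime = i64;

// NS -> Group -> Snapshot -> index file paths. BTreeMap keeps snapshots
// ordered by timestamp, so consecutive snapshots of a group are visited
// back to back.
#[derive(Default)]
struct GroupedImageList {
    inner: BTreeMap<Namespace, BTreeMap<Group, BTreeMap<SnapshotTime, Vec<PathBuf>>>>,
}

impl GroupedImageList {
    fn insert(&mut self, ns: &str, group: &str, time: SnapshotTime, path: PathBuf) {
        self.inner
            .entry(ns.to_string())
            .or_default()
            .entry(group.to_string())
            .or_default()
            .entry(time)
            .or_default()
            .push(path);
    }

    // Visit snapshots in timestamp order within each group; the caller can
    // reset its in-memory chunk digest cache whenever the snapshot changes,
    // keeping memory usage bounded.
    fn for_each_snapshot<F: FnMut(&str, &str, SnapshotTime, &[PathBuf])>(&self, mut f: F) {
        for (ns, groups) in &self.inner {
            for (group, snapshots) in groups {
                for (time, paths) in snapshots {
                    f(ns, group, *time, paths);
                }
            }
        }
    }
}
```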
> >
> > we could go one step further, but that might eat some of our performance
> > gains here
> >
> > - list all images like we did before and store the result in a
> > collection that allows fast removal
> > - iterate normally over the datastore in a structured fashion using the
> > existing helpers
> > -- when proceeding from one snapshot to the next, use the new "don't
> > retouch chunks" logic
> > -- remove each visited image path from the list of images
> > - if any images are left at the end in that list, mark those manually
> > (strange or vanished paths, should hopefully be empty or irrelevant)
> >
> > the main downside is that we'd have to iterate twice (well, not quite
> > twice, since the hierarchical iterators skip parts outside of the
> > "known" hierarchy), but we would save all this custom parse-back logic
> > here. if we keep the parse-back logic, then I think we want to have a
> > logical structure as well that follows how we normally do things.
> >
> > I think that the list_images part of GC is by far the least expensive
> > though (for my test datastore where the GC runtime goes down from 52 to
> > 17s with your patch series, listing images is 12ms of that ;)), so
> > effectively doing it twice might not hurt that much in practice..
>
> I do like this approach more, as it will also detect snapshots as
> "strange paths" when the helpers themselves fail to detect an index
> file for some reason, not just when the snapshot path itself is off.
>
> So I will reiterate the patches using this approach, thanks!
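The agreed-upon two-pass approach might be sketched roughly as follows. All inputs are hypothetical placeholders: `all_images` stands in for the full recursive image listing, `structured` for the (snapshot id, index path) pairs produced by the existing hierarchical iterators, and `touch_index` for marking the chunks of one index (the bool signalling a snapshot change, i.e. when the in-memory chunk digest cache may be reset):

```rust
use std::collections::HashSet;
use std::path::PathBuf;

// Sketch only, not the actual GC code: visit all indices via the
// structured iteration, removing each from the flat listing; anything
// left afterwards has a strange or vanished path and must be marked
// manually. Returns those leftover paths.
fn mark_used_chunks(
    all_images: Vec<PathBuf>,
    structured: Vec<(u64, PathBuf)>,
    mut touch_index: impl FnMut(&PathBuf, bool),
) -> Vec<PathBuf> {
    let mut remaining: HashSet<PathBuf> = all_images.into_iter().collect();
    let mut last_snapshot: Option<u64> = None;
    for (snapshot, path) in structured {
        // snapshot boundary: the caller may reset its chunk digest cache
        let snapshot_changed = last_snapshot != Some(snapshot);
        last_snapshot = Some(snapshot);
        touch_index(&path, snapshot_changed);
        remaining.remove(&path);
    }
    remaining.into_iter().collect()
}
```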
perf numbers for bigger datastores and the usual pathological candidates
(NFS/Samba, single spinning rust) would be nice in that case - I'd guess
that even when iterating twice we should still end up faster, as the
reduced chunk accesses should by far outweigh the metadata accesses for
listing (unless you only have a single snapshot per group), but hard
numbers are better than gut feelings :) both duration and the number of
file operations might be interesting (maybe for stock, this version and
the next one?)