[pbs-devel] [PATCH proxmox-backup v2 1/3] fix #3786: api: add resync-corrupt option to sync jobs

Fabian Grünbichler f.gruenbichler at proxmox.com
Tue Nov 5 08:20:14 CET 2024


> Gabriel Goller <g.goller at proxmox.com> hat am 04.11.2024 17:02 CET geschrieben:
> 
>  
> On 04.11.2024 12:51, Fabian Grünbichler wrote:
> >this doesn't really do what it says on the tin, see below.
> >
> >On October 18, 2024 11:09 am, Gabriel Goller wrote:
> >> @@ -528,6 +554,10 @@ async fn pull_group(
> >>          .enumerate()
> >>          .filter(|&(pos, ref dir)| {
> >>              source_snapshots.insert(dir.time);
> >> +            // If resync_corrupt is set, we go through all the remote snapshots
> >> +            if params.resync_corrupt {
> >> +                return true;
> >> +            }
> >
> >alternatively, we could check the local manifest here, and only include
> >existing snapshots with a failed verification state, the last one and
> >new ones? that way, we'd get more meaningful progress stats as well..
> 
> That's true, that would be cleaner. The downside is that we would have
> open/parse the BackupManifest twice.
> 
> I could write something like:
> 
>      if params.resync_corrupt {
>         let local_dir = params.target.store.backup_dir(target_ns.clone(), dir.clone());
>         if let Ok(dir) = local_dir {
>            let verify_state = dir.verify_state(); 
>            if verify_state == Some(VerifyState::Failed) {
>                return true;
>            }
>         }
>      }
> 
> >because right now, this will not only resync existing corrupt snapshots,
> >but also ones that have been pruned locally, but not on the source
> >(i.e., the other proposed "fixing" sync mode that syncs "missing"
> >old snapshots, not just corrupt ones).
> 
> I'm too stupid to find the mail where this was mentioned/discussed, I'm
> quite sure we said to just pull both, and then maybe separate them in a
> future iteration/feature. But now that I think about the flag is named
> `resync_corrupt` so I'd expect it to only pull in the corrupt snapshots.
> 
> I actually agree with this change, it probably is also more performant
> (reading backup_manifest twice is probably faster than pulling lots of
> unneeded manifests from the remote).

I think we do want both of these, but they are not a single feature, since the "sync missing snapshots" option would effectively undo any pruning you do on the target side. Their implementation is of course somewhat intertwined, as they both affect the snapshot selection logic. They might even be both enabled and combined with remove_vanished to have a sort of 1:1 repairing replication going on (with all the pros and cons/danger that comes with it).

For syncing missing snapshots we might also want to require additional privileges to prevent lesser privileged users from stuffing backup groups (i.e, at least require Prune privs?).




More information about the pbs-devel mailing list