[pve-devel] [PATCH storage] fix #4289: pbs: wait for backup verification to finish before updating volume attribute

Christoph Heiss c.heiss at proxmox.com
Tue Jan 10 12:11:41 CET 2023


Thanks for the review!

On Wed, Jan 04, 2023 at 11:50:38AM +0100, Fiona Ebner wrote:
> Am 02.01.23 um 13:36 schrieb Christoph Heiss:
> > diff --git a/PVE/Storage/PBSPlugin.pm b/PVE/Storage/PBSPlugin.pm
> > index 4320974..1cdbc11 100644
> > --- a/PVE/Storage/PBSPlugin.pm
> > +++ b/PVE/Storage/PBSPlugin.pm
> > @@ -906,8 +906,30 @@ sub get_volume_attribute {
> >      return;
> >  }
> >
> > +sub wait_for_verify_finish {
> > +    my ($conn, $node, $datastore, $attrs) = @_;
> > +
> > +    my $param = {
> > +	running => 'true',
> > +	since => $attrs->{'backup-time'},
> > +	store => $datastore,
> > +	typefilter => 'verify',
> > +    };
> > +
> > +    my $taskname = sprintf('%s:%s/%s/%X',
> > +	$datastore,
> > +        @{$attrs}{qw(backup-type backup-id backup-time)},
> > +    );
>
> I don't think it's likely that the task name format here will change
> often, but as you already mentioned in the cover letter, it's not ideal
> to have it hard-coded here.
>
> > +
> > +    while (1) {
> > +	my $res = eval { $conn->get("/api2/json/nodes/$node/tasks", $param); };
> > +	last if !grep { $_->{worker_id} eq $taskname } @$res;
> > +	sleep(1);
> > +    }
> > +}
> > +
> > @@ -921,6 +943,9 @@ sub update_volume_attribute {
> >  	my $conn = pbs_api_connect($scfg, $password);
> >  	my $datastore = $scfg->{datastore};
> >
> > +	$logfunc->('info', 'waiting for server to finish backup verification...') if $logfunc;
>
> Should only be printed if there is actually a verification we need to
> wait for.
Makes sense.

>
> > +	wait_for_verify_finish($conn, $scfg->{server}, $datastore, $param);
>
> To me, it feels out of place to be concerned with waiting on
> verification in (the rather low-level) update_volume_attribute(), which
> is a rather specific thing to do. I'd say it's fine to fail there when
> the snapshot is locked by verification or some other operation.
>
> Waiting for verification also can increase the backup duration/time
> holding the vzdump lock on the PVE side quite a bit.
That was one of my concerns too. Especially for very big VMs this can
probably delay the task quite a bit.

> It might not seem that big of a deal, because usually only manual
> backups use 'protected'.  But by doing it in
> update_volume_attribute(), you also do it for 'notes', where it's not
> needed and which is relevant to backup jobs where the increased wait
> might be very noticeable. So at least, it should only be done for
> 'protected' if doing it in update_volume_attribute().
That is actually the case now - updating notes takes a different path
through update_volume_notes().

>
> It would be better if the protected flag could be specified upon
> creation already. Would also fix the following race I guess:
It definitely would be a lot cleaner. I'll see what I can do and rework
the whole series.
Probably involves adding a new parameter to the `proxmox-backup-client
backup` command and API(?) AFAICS. But this would not be all that bad
of a feature for the backup client in general, I think.

And I guess I need to figure out a way how to detect whether the new
parameter is supported or not?
In case this it not supported, just keeping the current behavior (i.e.
best-effort via the API and maybe failing) is probably the sensible way.

> 1. backup finishes
> 2. prune running on PBS
> 3. protected status set from PVE
>
> If going for the waiting approach after all, I think it should rather be
> done in vzdump, before calling update_volume_attribute(). And the helper
> to wait on verification should likely be part of PBSClient.pm (would
> need to teach it to use an API connection first).






More information about the pve-devel mailing list