[pbs-devel] [PATCH backup] fix #3336: cleanup when deleting last snapshot

Mon Mar 10 16:22:59 CET 2025

On Mon Mar 10, 2025 at 4:19 PM CET, Fabian Grünbichler wrote:
>
>> Wolfgang Bumiller <w.bumiller at proxmox.com> hat am 10.03.2025 11:53 CET geschrieben:
>>
>>
>> On Fri, Mar 07, 2025 at 04:53:14PM +0100, Shannon Sterz wrote:
>> > On Fri Mar 7, 2025 at 4:33 PM CET, Wolfgang Bumiller wrote:
>> > > On Fri, Mar 07, 2025 at 11:37:32AM +0100, Shannon Sterz wrote:
>> > >> On Thu Mar 6, 2025 at 1:08 PM CET, Maximiliano Sandoval wrote:
>> > >> > When the last snapshot from a group is deleted we clear the entire
>> > >> > group, this in turn cleans the owner for the group.
>> > >> >
>> > >> > Without this change, the user is unable to change the owner of the group
>> > >> > after the last snapshot has been deleted. This would prevent a new
>> > >> > backups to the same group from a different owner.
>> > >> >
>> > >> > Signed-off-by: Maximiliano Sandoval <m.sandoval at proxmox.com>
>> > >> > ---
>> > >> >  src/api2/admin/datastore.rs | 12 +++++++++++-
>> > >> >  1 file changed, 11 insertions(+), 1 deletion(-)
>> > >> >
>> > >> > diff --git a/src/api2/admin/datastore.rs b/src/api2/admin/datastore.rs
>> > >> > index dbb7ae47..305673f1 100644
>> > >> > --- a/src/api2/admin/datastore.rs
>> > >> > +++ b/src/api2/admin/datastore.rs
>> > >> > @@ -423,10 +423,20 @@ pub async fn delete_snapshot(
>> > >> >              &backup_dir.group,
>> > >> >          )?;
>> > >> >
>> > >> > -        let snapshot = datastore.backup_dir(ns, backup_dir)?;
>> > >> > +        let snapshot = datastore.backup_dir(ns.clone(), backup_dir)?;
>> > >> >
>> > >> >          snapshot.destroy(false)?;
>> > >> >
>> > >> > +        let group = BackupGroup::from(snapshot);
>> > >> > +        if group.list_backups().is_ok_and(|backups| backups.is_empty()) {
>> > >> > +            if let Err(err) = datastore.remove_backup_group(&ns, group.as_ref()) {
>> > >> > +                log::error!(
>> > >> > +                    "error while cleaning group {path:?} - {err}",
>> > >> > +                    path = group.full_group_path()
>> > >> > +                );
>> > >> > +            }
>> > >> > +        }
>> > >> > +
>> > >>
>> > >> this bug... looks so harmless, but comes back to haunt me every time. as
>> > >> explained by wobu here [1] it is not really possible to cleanly fix this
>> > >> bug without reworking our locking mechanism. i did send some patches for
>> > >> that (checks notes) almost 3 years ago ^^', but they are still not
>> > >> applied and definitively need a rework at this point [2]. i can pick
>> > >> this up again, sorry got focused on other things in the meantime.
>> > >>
>> > >> [1]: https://lore.proxmox.com/pbs-devel/20220314093617.n2mc2jv4k6ntzroo@wobu-vie.proxmox.com/
>> > >> [2]: https://lore.proxmox.com/pbs-devel/20220824124829.392189-1-s.sterz@proxmox.com/
>> > >
>> > > IIRC we need to figure out a good upgrade strategy so running processes
>> > > don't use different locking.
>> > >
>> > > One idea was to have the postinst script create a file in run (eg
>> > > `/run/proxmox-backup/old-locking`) which, when present, instructs the
>> > > daemons to keep using the old strategy.
>> > >
>> > > The new one would then automatically be used after either a reboot, or
>> > > manually removing the file between stop & start of the daemons.
>> >
>> > yeah i remember that being a blocker, but since pbs 4 is coming up
>> > soon-ish, couldn't we just apply it then? upgrading between 3 -> 4
>> > requiring a reboot seems reasonable to me, though maybe i'm missing
>> > something (e.g. could it be problematic to have the services running,
>> > even shortly, before the reboot?).
>> >
>> > if that would be an option, it'd be much simpler than carrying around
>> > that switching logic forever (or at least one major version?). also,
>>
>> I suppose that could work. Depending on how we want to deal with
>> old-version support there? @Thomas?
>
> I don't think that works, unless we want to require putting all datastores
> into maintenance mode prior to the upgrade and until the system has been
> rebooted?
>
> otherwise, the upgrade from 3.x to 4.x is just like any other, with all the
> same problems if the old proxy still has a backup or other lock-holding task
> running and the new one uses different locking..

i feel like it would be fine to do that though? we already optionally
recommended that when upgrading from 2 -> 3 [1]. so requiring that and
documenting it in the upgrade notes sounds fine to me.

[1]: https://pbs.proxmox.com/wiki/index.php/Upgrade_from_2_to_3#Optional:_Enable_Maintenance_Mode

>
>> > what would happen if a user accidentally creates that file after the new
>> > locking is already in-place? do we consider this "bad luck" or do we
>> > want some kind of protection in-place?
>>
>> For upgrade->downgrade->upgrade switcheroo the postinst script should be
>> able to prevent this, since it gets old and new versions.
>>
>> For anything else I'm putting "accidentally creating that file" in the
>> same category as "accidentally modifying files in their PBS storage" and
>> "accidentally putting their hard drives in the microwave".
>
> agreed on that part :)