[pbs-devel] applied: [RFC PATCH proxmox-backup 2/2] file-restore: dynamically increase memory of vm for zpools

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Nov 8 10:19:20 CET 2022


Am 08/11/2022 um 08:59 schrieb Dominik Csapak:
> On 11/7/22 18:15, Thomas Lamprecht wrote:
>> Am 07/11/2022 um 13:35 schrieb Wolfgang Bumiller:
>>> applied
>>
>> meh, can we please make this opt-in and only enable it for root at pam or for users
>> with some powerful priv on /, as discussed as the chosen approach for allowing more
>> memory the last time this came up (off list IIRC)... I really do *not* want the memory
>> DOS potential increased by a lot just from opening some file-restore tabs, this actually
>> should get more restrictions (for common "non powerful" users), not fewer..
> 
> understandable, so i can do that, but maybe it's time we rethink the file-restore
> mechanism as a whole, since it's currently rather unergonomic:

IMO the current approach is very ergonomic as the user doesn't have to care
at all about details, and I'd like to keep it that way as much as possible.
While we may need to add some, every extra setting or extra workload for users/admins
should be kept to a minimum.

> 
> * users don't know how many and which file restore vms are running, they
> may not even know it starts a vm at all

that isn't an issue, the user wants file restores, not to care about managing
their details.

> * regardless of with/without my patch, the only thing necessary to start a bunch of vms
> is VM.Backup on the vm and Datastore.AllocateSpace on the storage
> (which in turn is probably enough to create an arbitrary amount of backups)
> * the disks/fs inside can be arbitrarily sized, so no fixed amount we give will always
> be enough
> 
> so here are some proposals on how to improve that (we won't implement all of them
> simultaneously, but maybe something from that list is usable)
> * make the list of running file-restore vms visible, and maybe add a manual 'shutdown'

the VM lives for a minute and can be spawned in milliseconds, so I'm not really seeing
the benefit of that; a misguided/bad/... user can spawn such VMs much faster than
one can stop them, and the syslog already contains basic info.

> * limit the amount of restore vms per user (or per vm?)
>   - this would need the mechanism from above anyway, since otherwise either the user
>     cannot start the restore vm or we abort an older vm (with a possibly running
>     download operation)

which mechanism? The list of running VMs is only required in the backend
and we can already find that out there.

> * make the vm memory configurable (per user/vm/globally?)

meh, but this will be one of the better mechanisms, combined with the priv/unpriv split
and different limits enforced via a cgroup (see below)

> * limit the global memory usage for file restore vms
>   - again needs some control mechanism for stopping/seeing these vms

Not really, it just needs a cgroup for limiting them all. Unpriv user A really
won't be able to stop the file-restore VM of unpriv user B, so this is not
really gaining anything - you cannot expect that an admin is around all
the time.

> * throw away the automatic starting of vms, and make it explicit, i.e.

NACK, that worsens UX a lot without a real benefit; the VM will still be
started in the end, and whatever limiting mechanism we choose could be applied
without any dialogue anyway..

>   make the user start/shutdown a vm manually
>   - we can have some 'configuration panel' before starting (like with restore)
>   - the user is aware it's starting
>   - still needs some mechanism to see them, but with a manual starting api call
>     it's easier to have e.g. a running worker that can be stopped
> 


As said weeks ago, the simple stopgap for now is to just opt into more possible
memory once a user has a sufficient privilege (e.g., VM.Allocate and/or Sys.Modify
on / as a starter); that does away with the basic woes for now. Sure, a priv
and unpriv user may get to "open" the same restore VM, and depending on the order
it will then be able to use more or less resources, but that won't matter much
in practice, and the same is true for the debug flag.
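
Roughly, such a gate could look like the following sketch; the UserPrivileges
struct and the memory constants are made up for illustration only, the real check
would naturally go through the existing ACL infrastructure:

// sketch only: gate the dynamic memory increase behind a privilege check,
// the privilege names and the UserPrivileges type are assumptions here
const BASE_MEMORY_MIB: u64 = 128; // default for unprivileged users
const ZPOOL_EXTRA_MEMORY_MIB: u64 = 256; // hypothetical bump for ZFS pools

struct UserPrivileges {
    has_vm_allocate_on_root: bool,
    has_sys_modify_on_root: bool,
}

// only users holding a sufficiently powerful privilege on / (e.g. VM.Allocate
// or Sys.Modify) opt into the bigger restore VM
fn restore_vm_memory_mib(privs: &UserPrivileges, needs_zpool_support: bool) -> u64 {
    let privileged = privs.has_vm_allocate_on_root || privs.has_sys_modify_on_root;
    if privileged && needs_zpool_support {
        BASE_MEMORY_MIB + ZPOOL_EXTRA_MEMORY_MIB
    } else {
        BASE_MEMORY_MIB
    }
}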

As a small addition, with possibly nice effects in practice:
add a "Finish" button to the file-restore window that actively sends a signal
to the VM on press, which then reduces the idle shutdown timer to ~3s (to avoid
breaking other currently running restores or sessions that are still open). That way
you don't need extra lists and management, as most of the time it happens indirectly just fine.
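
On the VM side that could be as small as the following sketch; the names and the
exact signalling channel (here just a flag flipped by some handler) are assumptions,
not the current restore daemon code:

use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

static FINISH_REQUESTED: AtomicBool = AtomicBool::new(false);

const NORMAL_IDLE_TIMEOUT: Duration = Duration::from_secs(60);
const FINISH_IDLE_TIMEOUT: Duration = Duration::from_secs(3);

// called by the (hypothetical) handler for the "Finish" signal from the UI
fn request_finish() {
    FINISH_REQUESTED.store(true, Ordering::SeqCst);
}

// how long the shutdown loop waits after the last activity before powering
// off; short once the user pressed "Finish", so open downloads still complete
fn current_idle_timeout() -> Duration {
    if FINISH_REQUESTED.load(Ordering::SeqCst) {
        FINISH_IDLE_TIMEOUT
    } else {
        NORMAL_IDLE_TIMEOUT
    }
}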

Additionally, we can put all VMs in a cgroup with a configured max memory; that really
could be a new setting in the node or datacenter.cfg, and it can easily be nested as in:

- root restore CG with a total limit, restores started by priv'd user start here
 '- user A CG with a per-unpriv-user-limit
 '- user B CG with a per-unpriv-user-limit

That way we would not need to limit the number of restores, which we don't care
about, but the actual resource we do care about. As a side benefit, the cgroup could
get reduced CPU shares so that scheduling prefers the "real" workload.
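
A rough sketch of how that nesting could be set up via the cgroup v2 filesystem
follows; the paths and limits are placeholders, and on a systemd host the
delegation would of course have to be handled properly:

use std::fs;
use std::io;
use std::path::Path;

const ROOT_CG: &str = "/sys/fs/cgroup/file-restore"; // placeholder path

// root restore CG with the total limit and reduced CPU weight
// note: with cgroup v2, processes live in leaf cgroups, so restores of
// privileged users would get their own (unlimited) child below this one
fn setup_root_cgroup(total_limit_bytes: u64) -> io::Result<()> {
    fs::create_dir_all(ROOT_CG)?;
    fs::write(Path::new(ROOT_CG).join("memory.max"), total_limit_bytes.to_string())?;
    fs::write(Path::new(ROOT_CG).join("cpu.weight"), "50")?; // default weight is 100
    // enable the controllers for the per-user child cgroups
    fs::write(Path::new(ROOT_CG).join("cgroup.subtree_control"), "+memory +cpu")?;
    Ok(())
}

// per-unpriv-user child CG with its own, smaller limit
fn setup_user_cgroup(userid: &str, per_user_limit_bytes: u64) -> io::Result<()> {
    let dir = Path::new(ROOT_CG).join(userid);
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("memory.max"), per_user_limit_bytes.to_string())?;
    Ok(())
}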

Another possibility would be to also evaluate the MicroVM machine type to reduce
the basic impact of the OS - tbh. I'm not sure if Stefan checked that out.
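
For reference, the machine-type switch itself would be something along these lines;
this is only an illustration, the real restore VM command line is of course longer
and different:

use std::process::{Child, Command};

// sketch: start a stripped-down restore VM with QEMU's microvm machine type,
// kernel/initrd paths and the argument list are placeholders
fn spawn_microvm_restore(kernel: &str, initrd: &str, mem_mib: u64) -> std::io::Result<Child> {
    let mem = mem_mib.to_string();
    Command::new("qemu-system-x86_64")
        .args(["-machine", "microvm", "-nodefaults", "-no-reboot"])
        .args(["-kernel", kernel, "-initrd", initrd])
        .args(["-m", mem.as_str()])
        .args(["-append", "console=ttyS0 quiet"])
        .args(["-serial", "stdio", "-display", "none"])
        .spawn()
}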

And please note that this all is mostly an artificial problem anyway. ZFS isn't
that popular inside of guests, which are the prime target for file-restore,
and especially not in the form of a huge set of disks, as redundancy et al. is
normally covered more centrally on the host. So if it's used at all, it's rather for
snapshots, for which lvm-thin or btrfs may be the better option inside guests, as
they're in-kernel-tree and one doesn't pay such a (relatively) big memory tax.

IIRC the issue was even only reported from internal testing of PVE or the like,
iow. odd setups. So please let's keep this simple and avoid breaking UX for all
the actual realistic use cases.

So please, make this opt-in on the PBS side first, then opt-in for the respective
user class (even just root at pam can be argued for, if really unsure), else I'll need
to revert this for the next bump.




