[pve-devel] [RFC storage 10/23] plugin: introduce new_backup_provider() method

Thu Jul 25 15:11:25 CEST 2024

On 25.07.24 at 11:48, Max Carrara wrote:
> On Tue Jul 23, 2024 at 11:56 AM CEST, Fiona Ebner wrote:
>> Signed-off-by: Fiona Ebner <f.ebner at proxmox.com>
> 
> Some overall thoughts:
> 
> 1.  I'm really, really happy to see documentation in this module here,
>     that's fantastic! :)
> 
>     While the contents of the docs seem fine, I would suggest using
>     POD instead. You can find an example in one of my recent series. [1]
>     I prefer POD mainly because it's what Perl uses; it also
>     indirectly makes sure we all use the same kind of format for
>     documenting our Perl code.
> 
>     Of course, we've currently not decided on any particular format, but
>     because the opportunity arose, I wanted to pitch POD here
>     nevertheless. ;)
> 

I'll look into it for v2. Agreed, following a standard for documenting
an API module has its merits.
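
For reference, a minimal sketch of what POD for a single documented
method could look like. The package name and the method shown here are
hypothetical placeholders, not taken from the patch:

```perl
package PVE::BackupProvider::Example; # hypothetical package, for illustration only

use strict;
use warnings;

=head1 NAME

PVE::BackupProvider::Example - illustrative backup provider (not a real module)

=head1 METHODS

=head2 provider_name

    my $name = $provider->provider_name();

Returns a short, human-readable name for the provider.

=cut

sub provider_name {
    my ($self) = @_;
    return 'example';
}

1;
```

Running perldoc on such a file would render the NAME and METHODS
sections, while the Perl interpreter skips everything between the POD
directive and "=cut".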

> 2.  I would personally prefer a namespace like `PVE::Backup::Provider`
>     instead of `PVE::BackupProvider`, simply because it leaves room for
>     further packages and reduces churn in the long term, IMO.
> 

There's a risk though that PVE::Backup::Provider and PVE::Backup::Foo
are unrelated things that have no real business sharing a namespace.

>     The same goes for backup provider plugins - IMO they should be
>     namespaced like e.g. `PVE::Backup::Provider::Plugin::Foo`, where
>     `Foo` is a (concrete) plugin.
> 

The BackupProvider namespace is already intended for the plugins. Adding
an extra level with "Plugin" would just bloat the module names,
especially if we decide to go the same route as for storage plugins and
have a "Custom" sub-namespace.

>     While this seems long or somewhat excessive, I think it enforces a
>     clear package / module hierarchy and keeps things tidier in the long
>     term, and those couple extra keystrokes don't really hurt anyone.
> 

I get where you're coming from, I just feel like BackupProvider might be
better as its own separate thing, containing the plugins for the
specific purpose. But I don't have a strong opinion about it, and am
fine making such changes if other developers prefer it too :)

> The above two methods - `backup_nbd` and `backup_directory` - is there
> perhaps a way to merge them? I'm not sure if what I'm having in mind
> here is actually feasible, but what I mean is "making the method
> agnostic to the type of backup". As in, perhaps pass a hash that
> contains a `type` key for the type of backup being made, and instead of
> having long method signatures, include the remaining parameters as the
> remaining keys. For example:
> 
> {
>     'type' => 'lxc-dir',  # type names are just examples here
>     'directory' => '/foo/bar/baz',
>     'bandwidth_limit' => 42,
>     ...
> }
> 
> {
>     'type' => 'vm-nbd',
>     'device_name' => '...',
>     'nbd_path' => '...',
>     ...
> }
> 
> You get the point :P
> 
> IMO it would make it easier to extend later, and also make it more
> straightforward to introduce new parameters / deprecate old ones, while
> the method signature stays stable otherwise.
> 
> The same goes for the different cleanup methods further down below;
> instead of having a separate method for each "type of cleanup being
> performed", let the implementor handle it according to the data the
> method receives.
> 
> IMHO it's best to be completely agnostic over VM / LXC backups
> (and their specific types) wherever possible and let the data describe
> what's going on instead.
> 

The point about extensibility is a good one. The API wouldn't need to
change even if we implement new mechanisms. But thinking about it some
more, is there anything really gained? Because we will not force plugins
to implement the methods for new mechanisms of course, they can just
continue supporting what they support. Each mechanism will have its own
specific set of parameters, so throwing everything into a catch-all
method and hash might make it too generic.

Or think about the documentation for the single backup method: it would
become super lengthy and describe all backup mechanisms, while a plugin
most likely only cares about a single one and would have an easier time
with a method that captures that mechanism's parameters explicitly.
Won't the end result be making the implementor's life slightly harder,
because they first need to extract the parameters for the specific mechanism?
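
To make the trade-off concrete, here is a rough sketch of the two API
shapes side by side. The method names backup_nbd and backup_directory
and the hash keys come from the discussion above; the package names,
parameter lists and return values are assumptions made up for
illustration and do not reflect the actual signatures in the patch:

```perl
use strict;
use warnings;

# Variant A: one explicit method per mechanism. Each method names exactly
# the parameters it needs (hypothetical parameter lists).
package Example::ExplicitProvider;

sub new { my ($class) = @_; return bless {}, $class; }

sub backup_nbd {
    my ($self, $device_name, $nbd_path) = @_;
    return "nbd backup of $device_name via $nbd_path";
}

sub backup_directory {
    my ($self, $directory) = @_;
    return "directory backup of $directory";
}

# Variant B: a single catch-all method. The implementor first has to look
# at 'type' and pull out the parameters belonging to that mechanism.
package Example::GenericProvider;

sub new { my ($class) = @_; return bless {}, $class; }

sub backup {
    my ($self, $info) = @_;
    my $type = $info->{type} // die "missing backup type\n";

    if ($type eq 'vm-nbd') {
        my ($device_name, $nbd_path) = @{$info}{qw(device_name nbd_path)};
        return "nbd backup of $device_name via $nbd_path";
    } elsif ($type eq 'lxc-dir') {
        return "directory backup of $info->{directory}";
    }
    die "unsupported backup type '$type'\n";
}

package main;

my $explicit = Example::ExplicitProvider->new();
my $generic = Example::GenericProvider->new();

print $explicit->backup_directory('/foo/bar/baz'), "\n";
print $generic->backup({ type => 'lxc-dir', directory => '/foo/bar/baz' }), "\n";
```

In variant B the signature stays stable when a new mechanism appears,
but every plugin carries the dispatch and extraction code itself, even
if it only supports a single mechanism.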

> For the specific types we can always then provide helper functions that
> handle common cases that implementors can use.
> 
> Extending on my namespace idea above, those helpers could then land in
> e.g. `PVE::Backup::Provider::Common`, `PVE::Backup::Provider::Common::LXC`,
> etc.
> 

Could you give an example of such a helper? I rather feel that libraries
like libnbd will fill that role, i.e. for accessing the NBD export. I'm
not sure we can create common helpers that would benefit multiple
providers: each has its own backend it needs to talk to, after all, and
might have quite different needs.