[pbs-devel] [PATCH proxmox-backup 1/4] api2: make remote for sync-jobs optional

Thu Feb 16 09:02:26 CET 2023

On February 15, 2023 12:40 pm, Thomas Lamprecht wrote:
> Am 14/02/2023 um 15:33 schrieb Fabian Grünbichler:
>> On February 13, 2023 4:45 pm, Hannes Laimer wrote:
>>> ... and update places where it is used.
>>> A SyncJob not having a remote means it is pulling
>>> from a local datastore.
>> high level: I wonder whether we really need this for sync jobs, or whether just
>> having it for pull (or as a new API/CLI endpoint copy/move?) would be enough as
>> a start? is there a use case for scheduled local syncing?
>>  
> 
> Yes, e.g. existing ones could be: having a small and fast "incoming" datastore,
> which avoids blocking guests on backups and has the "hot" set of snapshots (most
> recent) available while using a slower, but huge second one for long term archival.

yeah, that one makes sense.

> Future ones would be syncing to an S3-backed object storage, which we probably only
> want to have done from existing data (similar to tape), but still avoid the media
> catalogue and labelling overhead tape must have to be really useful.

not sure - we'd probably also want to combine chunks into bigger objects for S3
to save costs? but that is something we can evaluate when we start designing
that feature in detail.

> Another future one is removable datastores, which this is upfront work for. While
> we might not always have time-triggered events there, it's still useful to have a
> sync job for, e.g., hot-plug triggered events.

could be implemented "inline", but yeah, having a list of "jobs to trigger" is
nicer and more flexible.

why I asked the question is the following:
- PullParameters is an internal implementation detail, and can be refactored
however we want
- SyncJobConfig is not - the naming of fields makes less sense now with
non-remote usage, we need to store an additional user there if we want
unprivileged local sync, and upgrading config files in place when it's not just
adding a new, optional field is yucky

so I guess we could
- live with the ugly config file/API/CLI parameters having remote optional for
the local case, job_owner/.. optional for the remote case, and remote_ns and
remote_store fields that are actually local for the local case (that's just what
came up so far, maybe more)
- split local and remote sync jobs (or copy and sync, or ..) into two different
configs so that each just has the fields it actually needs with names that make
sense - but also kinda meh
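
to make the two options a bit more concrete, here is a rough sketch. it is not
the actual proxmox-backup code: the struct layout, the plain String types, the
`SyncSource` enum and the `source()` helper are all made up for illustration,
only the field names remote, remote_store, remote_ns and job_owner come from
this thread. requiring job_owner for the local case is also just an assumption
of the sketch.

```rust
// Hypothetical sketch, not the real proxmox-backup definitions.

/// Option 1: a single config where `remote` is optional.
/// `remote: None` means "pull from a local datastore".
#[derive(Debug, Clone, PartialEq)]
pub struct SyncJobConfig {
    /// Target datastore.
    pub store: String,
    /// Remote to pull from; `None` for the local case.
    pub remote: Option<String>,
    /// Source datastore; despite the name it is local if `remote` is `None`.
    pub remote_store: String,
    /// Source namespace; same naming caveat as `remote_store`.
    pub remote_ns: Option<String>,
    /// User the job runs as; only meaningful for the local case.
    pub job_owner: Option<String>,
}

/// Option 2: split by source, so each variant only carries the fields it
/// needs, with names that fit.
#[derive(Debug, Clone, PartialEq)]
pub enum SyncSource {
    Remote {
        remote: String,
        store: String,
        ns: Option<String>,
    },
    Local {
        store: String,
        ns: Option<String>,
        job_owner: String,
    },
}

impl SyncJobConfig {
    /// Interpret the flat config (option 1) as a typed source (option 2).
    /// Assumption of this sketch: a local job must name a `job_owner`.
    pub fn source(&self) -> Result<SyncSource, String> {
        match &self.remote {
            Some(remote) => Ok(SyncSource::Remote {
                remote: remote.clone(),
                store: self.remote_store.clone(),
                ns: self.remote_ns.clone(),
            }),
            None => {
                let job_owner = self
                    .job_owner
                    .clone()
                    .ok_or_else(|| "local sync job requires job_owner".to_string())?;
                Ok(SyncSource::Local {
                    store: self.remote_store.clone(),
                    ns: self.remote_ns.clone(),
                    job_owner,
                })
            }
        }
    }
}
```

with the flat variant the "which fields are valid together" rules end up in
code like `source()`, while the split variant lets the type carry them, at the
cost of a second config shape to maintain.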

> Besides that, I'm a bit reserved against adding a move that can cross datastore
> boundaries, as doing that manually seems not that useful for any but the smallest
> PBS instances (especially on the snapshot level) and for others a sync + prune
> is normally better anyway. Moving groups and namespaces around in the same datastore
> OTOH would be useful for organizational purposes, and without crossing into another CAS
> also simple to implement.

yeah, move/copy would just be "alternative" endpoints for local pulling that
re-use the pull code under the hood, but expose a better set of API
parameters/terminology. "move" could definitely be restricted to intra-datastore
operations, if we implement it.
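
a minimal sketch of what that restriction could look like at the parameter
check level. `LocalOp` and `check_local_op` are hypothetical names that do not
exist in proxmox-backup; the real endpoints would reuse the pull code after a
check like this one.

```rust
// Hypothetical sketch: "copy" may cross datastores, "move" may not.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LocalOp {
    Copy,
    Move,
}

/// Reject a move whose source and target datastore differ.
pub fn check_local_op(op: LocalOp, source_store: &str, target_store: &str) -> Result<(), String> {
    if op == LocalOp::Move && source_store != target_store {
        return Err(format!(
            "move from '{}' to '{}' crosses datastore boundaries",
            source_store, target_store
        ));
    }
    Ok(())
}
```

anything that needs to cross datastores would then go through a sync job plus
prune, as suggested above.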