[pbs-devel] RFC: Scheduler for PBS

Christian Ebner c.ebner at proxmox.com
Fri Aug 9 13:22:31 CEST 2024


> On 09.08.2024 11:31 CEST Max Carrara <m.carrara at proxmox.com> wrote:
> Architectural Overview
> ----------------------
> 
> The scheduler internally contains the type of job queue that is being
> used, which in our case is a simple FIFO queue. We also used HTTP
> long-polling [3] to schedule backup jobs, responding to the client only
> when the backup job is started.
> 
> While long-polling appears to work fine for our current intents and
> purposes, we still want to test if any alternatives (e.g.
> "short-polling", as in normal polling) are more robust.
> 
> The main way to communicate with the scheduler is via its event loop.
> This is a plain tokio task with an inner `loop` that matches on an enum
> representing the different events / messages the scheduler may handle.
> Such an event would be e.g. `NewBackupRequest` or `ConfigUpdate`.
> 
> The event loop receives events via an mpsc channel and may respond to
> them individually via oneshot channels which are set up when certain
> events are created. The benefit of tokio's channels is that they can
> also work in blocking contexts, so it is possible to completely isolate
> the scheduler in a separate thread if needed, for example.
> 
> Because users should also be able to dynamically configure the
> scheduler, configuration changes are handled via the `ConfigUpdate`
> event. That way even the type of the queue can be changed on the fly,
> which one prototype is able to do.
> 
> Furthermore, our prototypes currently run inside `proxmox-backup-proxy`
> and are reasonably decoupled from the rest of PBS, due to the scheduler
> being event-based.

Thanks for the write-up, this does sound interesting!

Do you plan to also include the notification system, e.g. by sending out notification events based on events/messages handled by the scheduler? Or will that solely be handled by the worker tasks?

What about periodic tasks that should be run at a given time, e.g. for server side alerts/monitoring tasks [0]? From your description, I suppose these would simply be a different job type, and would therefore be queued/executed based on their priority?

Can you already share some code (maybe of one of the prototypes), so one can have a closer look and do some initial testing, or is it still too experimental for that?
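
To make the discussion a bit more concrete, below is a minimal sketch of how such an event loop might look. It is only an assumption based on the description above, not the prototype code. It uses a plain thread and `std::sync::mpsc` channels instead of tokio so that it runs without external crates; a second std channel stands in for the oneshot reply channel. Only the names `NewBackupRequest` and `ConfigUpdate` come from the description, everything else (`QueueKind`, `StartNext`, `Shutdown`, `spawn_scheduler`, `start_next`, `demo`) is hypothetical.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Queue types the scheduler could switch between (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum QueueKind {
    Fifo,
    Lifo,
}

/// Events handled by the loop. The first two variant names follow the
/// description; all payloads and the other variants are assumptions.
enum Event {
    NewBackupRequest { job: String, reply: Sender<usize> },
    ConfigUpdate { kind: QueueKind },
    StartNext { reply: Sender<Option<String>> },
    Shutdown,
}

/// Runs the event loop on its own thread and returns the sending half.
fn spawn_scheduler() -> (Sender<Event>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<Event>();
    let handle = thread::spawn(move || {
        let mut kind = QueueKind::Fifo;
        let mut queue: VecDeque<String> = VecDeque::new();
        // `recv` fails once every sender is dropped, which also ends the loop.
        while let Ok(event) = rx.recv() {
            match event {
                Event::NewBackupRequest { job, reply } => {
                    queue.push_back(job);
                    // Reply with the queue length; ignore a requester that went away.
                    let _ = reply.send(queue.len());
                }
                Event::ConfigUpdate { kind: new_kind } => kind = new_kind,
                Event::StartNext { reply } => {
                    let next = match kind {
                        QueueKind::Fifo => queue.pop_front(),
                        QueueKind::Lifo => queue.pop_back(),
                    };
                    let _ = reply.send(next);
                }
                Event::Shutdown => break,
            }
        }
    });
    (tx, handle)
}

/// Asks the scheduler for the next job and waits for the answer.
fn start_next(tx: &Sender<Event>) -> Option<String> {
    let (reply, rx) = mpsc::channel();
    tx.send(Event::StartNext { reply }).ok()?;
    rx.recv().ok()?
}

/// Queues three jobs, switches the queue type on the fly, and returns
/// the order in which the jobs were handed out.
fn demo() -> Vec<String> {
    let (tx, handle) = spawn_scheduler();
    for job in ["a", "b", "c"] {
        let (reply, rx) = mpsc::channel();
        tx.send(Event::NewBackupRequest { job: job.to_string(), reply }).unwrap();
        rx.recv().unwrap(); // wait until the request has been queued
    }
    let mut started = Vec::new();
    started.extend(start_next(&tx)); // FIFO: oldest job first
    tx.send(Event::ConfigUpdate { kind: QueueKind::Lifo }).unwrap();
    started.extend(start_next(&tx)); // LIFO: newest remaining job
    started.extend(start_next(&tx));
    tx.send(Event::Shutdown).unwrap();
    handle.join().unwrap();
    started
}
```

Calling `demo()` hands out job "a" first, then "c" and "b" after the switch to the hypothetical LIFO queue. Periodic tasks could presumably be added as a further event or job type in the same `match`, but that part is speculation.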

Cheers,
Chris

[0] https://bugzilla.proxmox.com/show_bug.cgi?id=5108



