[pdm-devel] [PATCH proxmox-datacenter-manager 10/25] metric collection: collect overdue metrics on startup/timer change
Wolfgang Bumiller
w.bumiller at proxmox.com
Thu Feb 13 15:19:31 CET 2025
On Thu, Feb 13, 2025 at 02:50:32PM +0100, Lukas Wagner wrote:
>
>
> On 2025-02-13 09:55, Wolfgang Bumiller wrote:
> >> loop {
> >> let old_settings = self.settings.clone();
> >> tokio::select! {
> >> @@ -124,7 +132,12 @@ impl MetricCollectionTask {
> >> "metric collection interval changed to {} seconds, reloading timer",
> >> interval
> >> );
> >> - timer = Self::setup_timer(interval);
> >> + (timer, next_run) = Self::setup_timer(interval);
> >> + // If change (and therefore reset) our timer right before it fires,
> >> + // we could potentially miss one collection event.
> >
> > Couldn't we instead just pass `next_run` through to `setup_timer` and
> > call `reset_at(next_run)` on it? (`first_run` would only be used in the
> > initial setup, so `next_run` could either be an `Option`, or the setup
> > code does the `next_aligned_instant` call...
> >
> > This should be much less code by making the new
> > `fetch_overdue{,_and_save_sate}()` functions unnecessary, or am I
> > missing something?
> >
>
> I guess the question is, do we want nicely aligned timer ticks?
>
> e.g. 14:01:00, 14:02:00, 14:03:00 ... for 60 second interval
> or 14:00:00, 14:05:00, 14:10:00 ... for a 5 minute interval?
>
> Because that was the main intention behind using the 'collection-interval' as
> a base for calculating the aligned instant for the first timer reset.
> If we reuse the 'old' `next_run` when the interval is changed, we
> also reuse the old alignment.
>
> For instance, when changing from initially 1 minute to 5 minutes, the
> timer ticks might come at
> 14:01:00, 14:06:00, 14:11:00
>
> Technically, the naming for the `next_run` variable is not the best,
> since it just contains the Instant when the timer *first* fires, but
> this is then never updated to the *next* time the timer will fire...
> So that means that when changing the interval with your suggested change,
> you'd pass an Instant to `reset_at` that is already in the past,
> causing the timer to fire immediately.
>
> If we *don't* care about the aligned ticks as described above, we could
> just use a static alignment boundary, e.g. 60 seconds.
> In this case we can also get rid of the fetch_overdue stuff, since
> at worst case we have 60 seconds until the next tick on startup or timer change,
> which should be good enough to prevent any significant gaps in the data.
What about setting a flag - if the current next tick was earlier than
the new next tick - to tell tick() to re-align the timer when it is next
triggered?
So when going from 1 to 5 minutes at 14:01:50, we `.reset_at(14:02:00)`
and also set `realign = true`, and at 14:02, tick() should
`.reset_at(14:05:00)`.
I just feel like the logic in the "fetch_overdue" code should not be
necessary to have, but if it's too awkward to handle via the tick timer,
it's fine to keep it in v2.
More information about the pdm-devel
mailing list