[pdm-devel] [PATCH proxmox-datacenter-manager 10/25] metric collection: collect overdue metrics on startup/timer change

Lukas Wagner l.wagner at proxmox.com
Thu Feb 13 14:50:32 CET 2025



On  2025-02-13 09:55, Wolfgang Bumiller wrote:
>>          loop {
>>              let old_settings = self.settings.clone();
>>              tokio::select! {
>> @@ -124,7 +132,12 @@ impl MetricCollectionTask {
>>                      "metric collection interval changed to {} seconds, reloading timer",
>>                      interval
>>                  );
>> -                timer = Self::setup_timer(interval);
>> +                (timer, next_run) = Self::setup_timer(interval);
>> +                // If change (and therefore reset) our timer right before it fires,
>> +                // we could potentially miss one collection event.
> 
> Couldn't we instead just pass `next_run` through to `setup_timer` and
> call `reset_at(next_run)` on it? (`first_run` would only be used in the
> initial setup, so `next_run` could either be an `Option`, or the setup
> code does the `next_aligned_instant` call...
> 
> This should be much less code by making the new
> `fetch_overdue{,_and_save_sate}()` functions unnecessary, or am I
> missing something?
> 

I guess the question is, do we want nicely aligned timer ticks?

e.g. 14:01:00, 14:02:00, 14:03:00 ... for 60 second interval
or   14:00:00, 14:05:00, 14:10:00 ... for a 5 minute interval?

Because that was the main intention behind using the 'collection-interval' as
a base for calculating the aligned instant for the first timer reset.
If we reuse the 'old' `next_run` when the interval is changed, we
also reuse the old alignment. 

For instance, when changing from initially 1 minute to 5 minutes, the
timer ticks might come at 
  14:01:00, 14:06:00, 14:11:00

Technically, the naming for the `next_run` variable is not the best,
since it just contains the Instant when the timer *first* fires, but
this is then never updated to the *next* time the timer will fire...
So that means that when changing the interval with your suggested change,
you'd pass an Instant to `reset_at` that is already in the past,
causing the timer to fire immediately.

If we *don't* care about the aligned ticks as described above, we could
just use a static alignment boundary, e.g. 60 seconds.
In this case we can also get rid of the fetch_overdue stuff, since
at worst case we have 60 seconds until the next tick on startup or timer change,
which should be good enough to prevent any significant gaps in the data.


-- 
- Lukas





More information about the pdm-devel mailing list