[pve-devel] wrong vm graph stats if vm is shutdown then restart
mlb-pve at corefiling.co.uk
Mon May 28 14:11:40 CEST 2012
On 24/05/12 05:51, Dietmar Maurer wrote:
>> network traffic and disk stats show about 400 Petabyte ;)
>>
>> Maybe this is because a new kvm process is created with 0 stats.
>
> I guess rrd detects a counter overflow and computes the difference, which leads to such a huge value.
that is almost certainly what it's doing, unfortunately. COUNTER assumes
the value never decreases except on overflow, so a reset to 0 looks like
a wrap. this is one of the trade-offs of using the COUNTER data-source
type.
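
to make the effect concrete, here is a rough sketch of how a COUNTER-style
rate could be computed. it is a simplified model, not rrdtool's actual
code: the function name, the 10-second step and the sample readings are
all made up for illustration, and it assumes any decrease is treated as a
32-bit wrap first and a 64-bit wrap otherwise.

```python
# simplified model of a COUNTER data source (assumption: any decrease
# is interpreted as a wrap, first at 32 bits, then at 64 bits)

WRAP_32 = 2 ** 32
WRAP_64 = 2 ** 64


def counter_rate(prev, curr, step_seconds):
    """Per-second rate between two raw readings, COUNTER-style."""
    delta = curr - prev
    if delta < 0:
        delta += WRAP_32              # try a 32-bit wrap first
    if delta < 0:
        delta += WRAP_64 - WRAP_32    # otherwise assume a 64-bit wrap
    return delta / step_seconds


# normal update: 1,000,000 bytes transferred in 10 s (hypothetical values)
normal = counter_rate(5_000_000_000, 5_001_000_000, 10)

# vm restarted: the new kvm process starts counting from (almost) zero
after_restart = counter_rate(5_000_000_000, 1_000, 10)

print(normal)         # ordinary rate
print(after_restart)  # absurdly large rate caused by the assumed wrap
```

in this model the reset is indistinguishable from a 64-bit wrap, so the
single sample after the restart dominates the whole graph.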
> This is obviously a very big value, and totally wrong.
>
> Maybe there is a flag, or other counter type in RRD with other behavior?
having encountered this problem when working on another RRD-based
project (Munin), the two solutions/workarounds i'm aware of are:
+ setting a .max value for the data-source. of course, selecting a
value that will be generally sane is non-trivial. the first data-point
after every reset will still be lost: it is stored as unknown instead of
as a spike.
+ changing the data-source type to DERIVE rather than COUNTER, along
with .min=0 so you don't get huge negative spikes when it wraps or
resets. in theory this means you'll lose a data-point every time it
wraps, as well as when it resets; in practice this is going to be
extremely unusual on 64-bit systems:
"On a 100 Gb/s interface running at line rate, they will wrap once every
46.8 years. Even on a 100 Tb/s interface, wraps won't occur more
frequently than every 17 days."
--from <http://kb.pert.geant.net/PERTKB/SnmpCounterWrap>, who should
know a bit about these things ;-)
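
the figures in that quote can be checked with a little arithmetic. this
sketch assumes a 64-bit counter that counts bytes (octets), decimal
prefixes (100 Gb/s = 100e9 bits/s) and a 365.25-day year; the helper name
is made up.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def seconds_until_wrap(bits_per_second, counter_bits=64):
    """Time for a byte counter to wrap at a constant line rate."""
    bytes_per_second = bits_per_second / 8
    return 2 ** counter_bits / bytes_per_second


years_100g = seconds_until_wrap(100e9) / SECONDS_PER_YEAR   # 100 Gb/s
days_100t = seconds_until_wrap(100e12) / SECONDS_PER_DAY    # 100 Tb/s

print(f"100 Gb/s: about {years_100g:.1f} years")
print(f"100 Tb/s: about {days_100t:.1f} days")
```

for comparison, calling seconds_until_wrap(1e9, counter_bits=32) shows
that a 32-bit byte counter at 1 Gb/s wraps in well under a minute, which
is why the wrap case only becomes negligible with 64-bit counters.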
see the "NOTE on COUNTER vs DERIVE" section of
<http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html> for a slightly
more detailed discussion.
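
as a rough illustration of the two workarounds, here is a simplified
model of each. again this is a sketch rather than rrdtool's code: the
function names, the 10-second step, the sample readings and the max of
125,000,000 bytes/s (roughly 1 Gb/s) are assumptions, and None stands in
for an unknown data-point.

```python
def counter_rate_with_max(prev, curr, step_seconds, max_rate):
    """COUNTER-style rate; results above max_rate become unknown (None)."""
    delta = curr - prev
    if delta < 0:
        delta += 2 ** 32              # assume a 32-bit wrap first
    if delta < 0:
        delta += 2 ** 64 - 2 ** 32    # otherwise assume a 64-bit wrap
    rate = delta / step_seconds
    return rate if rate <= max_rate else None


def derive_rate_with_min(prev, curr, step_seconds, min_rate=0.0):
    """DERIVE-style rate; results below min_rate become unknown (None)."""
    rate = (curr - prev) / step_seconds
    return rate if rate >= min_rate else None


# hypothetical raw readings; the vm restarts between the 2nd and 3rd
readings = [5_000_000_000, 5_001_000_000, 1_000, 2_001_000]
pairs = list(zip(readings, readings[1:]))

with_max = [counter_rate_with_max(p, c, 10, 125_000_000) for p, c in pairs]
with_derive = [derive_rate_with_min(p, c, 10) for p, c in pairs]

print(with_max)
print(with_derive)
```

both variants drop only the sample that spans the restart. one difference
shows up in this model: if the old reading was below 2**32, the assumed
32-bit wrap can yield a rate that is wrong but still under the max, so it
slips through, whereas DERIVE with a minimum of 0 rejects any decrease.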
there are also some notes on manual, after-the-fact spike removal here,
which may be of interest:
<http://munin-monitoring.org/wiki/SpikeRemoval>.
--matt
--
Matthew Boyle, Systems Administrator, CoreFiling Limited
Telephone: +44-1865-203192 Website: http://www.corefiling.com