[pve-devel] wrong vm graph stats if vm is shutdown then restart

mlb-pve at corefiling.co.uk
Mon May 28 14:11:40 CEST 2012

On 24/05/12 05:51, Dietmar Maurer wrote:
>> network traffic and disk stats show about 400 Petabyte ;)
>> Maybe this is because a new kvm process is created with 0 stats.
> I guess rrd detects a counter overflow and computes the difference, which leads to such a huge value.

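that is almost certainly what it's doing, unfortunately.  COUNTER assumes
the value only ever increases, so any decrease is treated as a 32-bit or
64-bit wrap rather than a reset.  this is one of the trade-offs of using
the COUNTER data-source type.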
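
to make the failure mode concrete, here is a rough python sketch of how a
COUNTER delta behaves when the counter goes backwards.  it is a
simplified model of rrdtool's behaviour, not its actual code; the
function name and the sample values are made up for illustration.

```python
# simplified model of COUNTER handling; counter_delta is a hypothetical
# helper, not an rrdtool API.
WRAP_32 = 2**32
WRAP_64 = 2**64


def counter_delta(previous, current):
    """Return the increase COUNTER would assume between two readings."""
    delta = current - previous
    if delta < 0:
        # a decrease is first assumed to be a 32-bit wrap...
        delta += WRAP_32
    if delta < 0:
        # ...and, if that is still negative, a 64-bit wrap.
        delta += WRAP_64 - WRAP_32
    return delta


# a guest had transferred 5 GB, then was shut down and restarted, so the
# new kvm process reports a small value again.
bogus = counter_delta(5_000_000_000, 1_000)
print(bogus)  # close to 2**64, not the ~1000 bytes really transferred
```

dividing that delta by the update interval gives the absurd rate that
ends up in the graph.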

> This is obviously a very big value, and totally wrong.
> Maybe there is a flag, or other counter type in RRD with other behavior?

having encountered this problem when working on another RRD-based 
project (Munin), the two solutions/workarounds i'm aware of are:

+ setting a .max value for the data-source, so that any computed rate
above it is stored as unknown instead of being graphed.  of course,
selecting a value that will be generally sane is non-trivial, and the
first data-point after every reset will be lost.

+ changing the data-source type to DERIVE rather than COUNTER, along
with .min=0 so you don't get huge negative spikes when it wraps or
resets.  in theory this means you'll lose a data-point every time the
counter wraps, as well as when it resets; in practice wraps are going to
be extremely rare with 64-bit counters:

"On a 100 Gb/s interface running at line rate, they will wrap once every 
46.8 years. Even on a 100 Tb/s interface, wraps won't occur more 
frequently than every 17 days."
   --from <http://kb.pert.geant.net/PERTKB/SnmpCounterWrap>, who should 
know a bit about these things ;-)
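
the figures in that quote can be checked with a few lines of python.
this is only back-of-the-envelope arithmetic, assuming the counter is 64
bits wide and counts octets (bytes); the function name is hypothetical.

```python
# time for an octet counter to wrap at a given line rate.
SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def wrap_seconds(bits_per_second, counter_bits=64):
    """Seconds until an octet (byte) counter of the given width wraps."""
    bytes_per_second = bits_per_second / 8
    return 2**counter_bits / bytes_per_second


years_100g = wrap_seconds(100e9) / SECONDS_PER_DAY / DAYS_PER_YEAR
days_100t = wrap_seconds(100e12) / SECONDS_PER_DAY
print(round(years_100g, 1), round(days_100t, 1))
```

by the same arithmetic a 32-bit octet counter at 1 Gb/s wraps in roughly
34 seconds, which is why the counter width matters so much here.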

see the "NOTE on COUNTER vs DERIVE" section of 
<http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html> for a slightly 
more detailed discussion.
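
as a rough illustration of the DERIVE approach, the python sketch below
models a rate calculation with a lower bound of 0.  it is a simplified
model, not rrdtool code, and bounded_rate is a made-up name.  in rrdtool
itself the bounds are the last two fields of the data-source definition
(DS:name:type:heartbeat:min:max), so something like
DS:netin:DERIVE:120:0:U would be the equivalent, where "netin" and the
120 second heartbeat are purely hypothetical values.

```python
# simplified model of how min/max bounds turn bad rates into "unknown".
# bounded_rate is a hypothetical helper, not an rrdtool API.

def bounded_rate(previous, current, step_seconds, minimum=None, maximum=None):
    """DERIVE-style rate: plain difference over time, None if out of bounds."""
    rate = (current - previous) / step_seconds
    if minimum is not None and rate < minimum:
        return None  # stored as unknown (a gap in the graph)
    if maximum is not None and rate > maximum:
        return None
    return rate


# normal operation: 6000 bytes in 60 seconds
print(bounded_rate(10_000, 16_000, 60, minimum=0))        # 100.0
# after a restart the counter drops; the negative rate is discarded
print(bounded_rate(5_000_000_000, 1_000, 60, minimum=0))  # None
```

the cost is the one described above: the sample spanning the reset
becomes a gap rather than a spike.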

there are also some notes on manual, after-the-fact, spike removal here, 
which may possibly be interesting: 


Matthew Boyle, Systems Administrator, CoreFiling Limited
Telephone: +44-1865-203192  Website: http://www.corefiling.com
