[pve-devel] [PATCH manager 1/4] ceph: add perf data cache helpers

Tue Nov 5 13:51:16 CET 2019

add a helper to cache the ceph performance data inside pmxcfs
with broadcast_node_kv, and also a helper to read it out

merge the data from all nodes that sent performance data

the '$perf_cache' variable actually serves two purposes:
the writer (will be pvestatd) uses it to broadcast only its own values,
and the reader (pvedaemon) uses it to cache the data of all nodes

Signed-off-by: Dominik Csapak <d.csapak at proxmox.com>
---
merging the data on read seems like a good idea, since we have the data
and it makes little sense to throw any of it away, but i noticed some
weird glitches when the pvestatd update calls of the nodes run close to
each other, since the timestamps are then e.g. ..01, ..02, ..03, ..11,
..12, etc.

this looks slightly weird on the extjs charts, since the data points are
clustered around some timestamps with gaps in between...
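
to illustrate the clustering, here is a minimal standalone sketch of the
merge-on-read step. the node names, timestamps and values are made up, and
the input is already decoded (the helper in this patch additionally decodes
the JSON it gets from get_node_kv), so this is not the patch code itself:

```perl
use strict;
use warnings;

# hypothetical, already decoded per-node samples: every node polls every
# 10 seconds, but with a small offset relative to the others
my $raw = {
    node1 => [ { time => 1001, ops_r => 5 }, { time => 1011, ops_r => 6 } ],
    node2 => [ { time => 1002, ops_r => 5 }, { time => 1012, ops_r => 7 } ],
    node3 => [ { time => 1003, ops_r => 4 }, { time => 1011, ops_r => 6 } ],
};

# merge the samples of all nodes, keep one entry per timestamp, sort by time
sub merge_samples {
    my ($raw) = @_;

    my $seen = {};
    my $res = [];

    for my $host (sort keys %$raw) {
        for my $entry (@{ $raw->{$host} // [] }) {
            push @$res, $entry if !$seen->{ $entry->{time} }++;
        }
    }

    return [ sort { $a->{time} <=> $b->{time} } @$res ];
}

my $merged = merge_samples($raw);
print join(', ', map { $_->{time} } @$merged), "\n";
# prints: 1001, 1002, 1003, 1011, 1012
```

the merged series has three points within two seconds, then an eight
second gap, which is the uneven spacing visible in the charts.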

we could of course always only use the data from the current node,
this way we would have a more consistent interval
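
a rough sketch of that alternative, assuming the same JSON-per-node layout
that get_node_kv returns in this patch. 'local_samples' and its
'$nodename' parameter are hypothetical names for illustration (in PVE the
node name would probably come from PVE::INotify::nodename()):

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# return only the samples that the given node broadcast, sorted by time;
# returns an empty list if the node sent nothing or sent invalid JSON
sub local_samples {
    my ($raw, $nodename) = @_;

    my $json = $raw->{$nodename};
    return [] if !defined($json);

    my $entries = eval { decode_json($json) };
    return [] if ref($entries) ne 'ARRAY';

    return [ sort { $a->{time} <=> $b->{time} } @$entries ];
}

# made-up example data: node2 sent something that is not valid JSON
my $example = {
    node1 => '[{"time":1011,"ops_r":6},{"time":1001,"ops_r":5}]',
    node2 => 'not json',
};

print join(', ', map { $_->{time} } @{ local_samples($example, 'node1') }), "\n";
# prints: 1001, 1011
```

the trade-off is that a node that does not broadcast any data itself would
then have nothing to show, so a fallback to the merged data might still be
needed in that case.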

 PVE/Ceph/Services.pm | 54 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/PVE/Ceph/Services.pm b/PVE/Ceph/Services.pm
index 45eb6c3f..70cb9b9a 100644
--- a/PVE/Ceph/Services.pm
+++ b/PVE/Ceph/Services.pm
@@ -360,4 +360,58 @@ sub destroy_mgr {
     return undef;
 }
 
+my $perf_cache = [];
+
+sub cache_perf_data {
+    my ($keep_seconds) = @_;
+
+    $keep_seconds //= 5*60; # 5 minutes
+
+    my $rados = PVE::RADOS->new();
+    my $status = $rados->mon_command({ prefix => "status" });
+
+    my $pgmap = $status->{pgmap};
+
+    my $time = time();
+    my $entry = {
+	time => $time,
+	ops_r => $pgmap->{read_op_per_sec},
+	ops_w => $pgmap->{write_op_per_sec},
+	bytes_r => $pgmap->{read_bytes_sec},
+	bytes_w => $pgmap->{write_bytes_sec},
+    };
+
+    push @$perf_cache, $entry;
+
+    # remove entries older than $keep_seconds
+    $perf_cache = [grep { $time - $_->{time} < $keep_seconds } @$perf_cache ];
+
+    my $data = encode_json($perf_cache);
+    PVE::Cluster::broadcast_node_kv("ceph-perf", $data);
+}
+
+sub get_cached_perf_data {
+
+    # only get new data if the already cached one is older than 10 seconds
+    if (scalar(@$perf_cache) > 0 && (time() - $perf_cache->[-1]->{time}) < 10) {
+	return $perf_cache;
+    }
+
+    my $raw = PVE::Cluster::get_node_kv("ceph-perf");
+
+    my $res = [];
+    my $times = {};
+
+    for my $host (keys %$raw) {
+	my $tmp = eval { decode_json($raw->{$host}) };
+	for my $entry (@$tmp) {
+	    my $etime = $entry->{time};
+	    push @$res, $entry if !$times->{$etime}++;
+	}
+    }
+
+    $perf_cache = [sort { $a->{time} <=> $b->{time} } @$res];
+    return $perf_cache;
+}
+
 1;
-- 
2.20.1