[pve-devel] [PATCH cluster] add get_guest_config_property IPCC method

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Jun 11 06:02:22 CEST 2019


This adds a new method to our IPCC interface. It's a helper to get a
property of a single guest, or of all guests.

It is restricted to looking at the current config state only, i.e.,
no PENDING changes and no snapshots; this is by design and wanted.
It uses the strict config format to be quick and to return/continue
early. Those restrictions are imposed by
PVE::QemuServer::parse_vm_config and its container counterpart; they
mostly boil down to the following regex:
/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/
and the fact that this is applied directly to the config lines (no
whitespace trimming of any kind happens before).
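
To illustrate the strictness, here is a minimal Perl sketch of the
per-line check (the example lines are made up); only a line matching
the pattern at column zero is considered at all:

    for my $line ('lock: backup', '  lock: backup', '[snap1]') {
        if ($line =~ m/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/) {
            print "key=$1 value=$2\n"; # only the first line matches
        }
    }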

The motivation for this work is to get the lock state of all guests
_quickly_, allowing us to pass this info along in our resource call
and enhance the tree view of our Web User Interface with that
information. It was kept a bit more general, without bending the code
too much. The call returns a serialized JSON object with the format:
VMID => { PROPERTY => PROPERTY_VALUE }
if the property was found in the VMID's configuration; an empty
object is returned if it was not found.

If one passes 0 as the VMID, all VM configurations are searched and
only those with a match are returned, in the same manner as above.
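
As a rough usage sketch from the Perl side (the VMID and the returned
lock value are just example data):

    use PVE::Cluster;

    # single guest: empty hash if 'lock' is not set, else,
    # e.g., { 100 => { lock => 'backup' } }
    my $res = PVE::Cluster::get_guest_config_property('lock', 100);

    # all guests: pass 0 (or omit the VMID), only matches are returned
    my $all = PVE::Cluster::get_guest_config_property('lock');
    print "$_ is locked: $all->{$_}->{lock}\n" for sort keys %$all;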

So why an IPCC call and not handling this in Perl? Performance.
Dominik proposed a Perl + cfs_read_file approach[0]; while that is
relatively short and stays in the (mostly) safer Perl land, it's
pretty slow, especially on first connections. The idea for this had
been in my head for quite a while, but Dominik's patch put it front
and center, and a prototype of this was born soon after. Initial
evaluation and performance comparison showed a speedup of >100 times
for the initial gathering. [0] really benefits from the ccache
afterwards, and as that _is_ a cache which gets used and hit often,
more "serial runs" (i.e., in the same worker) make his approach more
reasonable, though. But after a bit of work this came into
not-too-ugly shape, and here are the numbers, with 10005 VM configs,
all "real" ones using about 502 bytes each, and all with a lock in
the worst place, at the end.

Legend:
C0: how many "serial runs" took place, i.e., how many runs in the
    same worker
C1: IPCC (this) approach total runtime in ms
C2: IPCC (this) approach per-run runtime in ms, i.e., C1 / C0
C3: Perl + cfs_read_file ([0]) approach total runtime in ms
C4: Perl + cfs_read_file ([0]) approach per-run runtime in ms, i.e., C3 / C0

Data:
C0	C1	C2	C3	C4
1	18.31	18.31	3570.29	3570.29
2	31.51	15.76	3717.69	1858.84
3	44.19	14.73	3821.84	1273.95
4	58.54	14.63	3950.24	987.56
5	70.31	14.06	4071.42	814.28
6	95.29	15.88	4175.95	695.99
7	95.87	13.70	4192.35	598.91
8	111.81	13.98	4346.84	543.36
9	120.84	13.43	4432.13	492.46
10	134.52	13.45	4554.25	455.42
11	149.74	13.61	4673.71	424.88
12	161.13	13.43	4797.56	399.80
13	172.74	13.29	4892.15	376.32
14	180.63	12.90	4951.17	353.66
15	199.04	13.27	5034.51	335.63

So, initially C beats Perl (not really the correct wording, but hey)
by >200 times. On the second run, though, we immediately see that
while IPCC scales almost linearly, the Perl approach doesn't; it
really benefits from the cache now: while the initial call needed
3.5s, the second one "only" needs ~200 ms more. But 200 ms is still
quite an addition to an API call. 10k VMs are not seen too often in
the wild, I guess, albeit easily possible with setups using a lot of
containers; we know about users having ~1500 CTs on only two nodes,
so it's not completely unrealistic either.

[0]: https://pve.proxmox.com/pipermail/pve-devel/2019-February/035830.html

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
 data/PVE/Cluster.pm    |  15 +++++
 data/src/cfs-ipc-ops.h |   2 +
 data/src/server.c      |  28 ++++++++
 data/src/status.c      | 148 +++++++++++++++++++++++++++++++++++++++++
 data/src/status.h      |   3 +
 5 files changed, 196 insertions(+)

diff --git a/data/PVE/Cluster.pm b/data/PVE/Cluster.pm
index 7f9b88e..d02a867 100644
--- a/data/PVE/Cluster.pm
+++ b/data/PVE/Cluster.pm
@@ -595,6 +595,21 @@ sub get_node_kv {
     return $res;
 }
 
+# property: a config property you want to get, e.g., this is perfect to get
+# the 'lock' entry of a guest _fast_ (>100 times faster than manual parsing)
+# vmid: optional, if a valid one is passed we only check that one, else return all
+# NOTE: does *not* search snapshot and PENDING entries sections!
+sub get_guest_config_property {
+    my ($property, $vmid) = @_;
+
+    die "property is required" if !defined($property);
+
+    my $bindata = pack "VZ*", $vmid // 0, $property;
+    my $res = $ipcc_send_rec_json->(CFS_IPC_GET_GUEST_CONFIG_PROPERTY, $bindata);
+
+    return $res;
+}
+
 # $data must be a chronological descending ordered array of tasks
 sub broadcast_tasklist {
     my ($data) = @_;
diff --git a/data/src/cfs-ipc-ops.h b/data/src/cfs-ipc-ops.h
index a1bf51d..e4026ad 100644
--- a/data/src/cfs-ipc-ops.h
+++ b/data/src/cfs-ipc-ops.h
@@ -39,4 +39,6 @@
 
 #define CFS_IPC_GET_RRD_DUMP 10
 
+#define CFS_IPC_GET_GUEST_CONFIG_PROPERTY 11
+
 #endif
diff --git a/data/src/server.c b/data/src/server.c
index 54289c7..c947091 100644
--- a/data/src/server.c
+++ b/data/src/server.c
@@ -83,6 +83,11 @@ typedef struct {
 	uint32_t res3;
 } cfs_log_get_request_header_t;
 
+typedef struct {
+	struct qb_ipc_request_header req_header;
+	uint32_t vmid;
+	char property[];
+} cfs_guest_config_propery_get_request_header_t;
 
 struct s1_context {
 	int32_t client_pid;
@@ -313,6 +318,29 @@ static int32_t s1_msg_process_fn(
 			cfs_rrd_dump(outbuf);
 			result = 0;
 		}
+	} else if (req_pt->id == CFS_IPC_GET_GUEST_CONFIG_PROPERTY) {
+
+		cfs_guest_config_propery_get_request_header_t *rh =
+			(cfs_guest_config_propery_get_request_header_t *) data;
+
+		int proplen = req_pt->size - G_STRUCT_OFFSET(cfs_guest_config_propery_get_request_header_t, property);
+
+		result = 0;
+		if (rh->vmid < 100 && rh->vmid != 0) {
+			cfs_debug("vmid out of range %u", rh->vmid);
+			result = -EINVAL;
+		} else if (rh->vmid >= 100 && !vmlist_vm_exists(rh->vmid)) {
+			result = -ENOENT;
+		} else if (proplen <= 0) {
+			cfs_debug("proplen <= 0, %d", proplen);
+			result = -EINVAL;
+		} else {
+			((char *)data)[req_pt->size] = 0; // ensure property is 0 terminated
+
+			cfs_debug("cfs_get_guest_config_property: basic valid checked, do request");
+
+			result = cfs_create_guest_conf_property_msg(outbuf, memdb, rh->property, rh->vmid);
+		}
 	}
 
 	cfs_debug("process result %d", result);
diff --git a/data/src/status.c b/data/src/status.c
index 177332d..d07cbb9 100644
--- a/data/src/status.c
+++ b/data/src/status.c
@@ -33,9 +33,11 @@
 #include <rrd.h>
 #include <rrd_client.h>
 #include <time.h>
+#include <ctype.h>
 
 #include "cfs-utils.h"
 #include "status.h"
+#include "memdb.h"
 #include "logger.h"
 
 #define KVSTORE_CPG_GROUP_NAME "pve_kvstore_v1"
@@ -171,6 +173,27 @@ static const char *vminfo_type_to_string(vminfo_t *vminfo) {
     }
 }
 
+static const char *vminfo_type_to_path_type(vminfo_t *vminfo) {
+    if (vminfo->vmtype == VMTYPE_QEMU) {
+        return "qemu-server"; // special case..
+    } else {
+        return vminfo_type_to_string(vminfo);
+    }
+}
+int vminfo_to_path(vminfo_t *vminfo, GString *path)
+{
+	g_return_val_if_fail(vminfo != NULL, -1);
+	g_return_val_if_fail(path != NULL, -1);
+
+	if (!vminfo->nodename)
+		return 0;
+
+	const char *type = vminfo_type_to_path_type(vminfo);
+	g_string_printf(path, "/nodes/%s/%s/%u.conf", vminfo->nodename, type, vminfo->vmid);
+
+	return 1;
+}
+
 void cfs_clnode_destroy(
 	cfs_clnode_t *clnode)
 {
@@ -763,6 +786,131 @@ cfs_create_vmlist_msg(GString *str)
 	return 0;
 }
 
+// checks the conf for a line starting with '$prop:' and returns the value
+// afterwards, without initial whitespace(s); we only deal with the format
+// restriction imposed by our perl VM config parser, main reference being
+// PVE::QemuServer::parse_vm_config; this allows us to be very fast and
+// still relatively simple
+// the main restriction used to our advantage is the property match regex:
+// ($line =~ m/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/) from parse_vm_config
+// currently we only look at the current configuration in place, i.e., *no*
+// snapshots and *no* pending changes
+static char *
+_get_property_value(char *conf, char *prop, int prop_len)
+{
+	char *line, *temp;
+
+	line = strtok_r(conf, "\n", &temp);
+	while (line != NULL) {
+		if (!line[0]) goto next;
+
+		// snapshot or pending section start and nothing found yet
+		if (line[0] == '[') return NULL;
+		// properties start with /^[a-z]/, so continue early if not
+		if (line[0] < 'a' || line[0] > 'z') goto next;
+
+		int line_len = strlen(line);
+		if (line_len <= prop_len + 1) goto next;
+
+		if (line[prop_len] == ':' && strncmp(line, prop, prop_len) == 0) { // found
+			char *v_start = &line[prop_len + 1];
+
+			// drop initial value whitespaces here already
+			while (*v_start && isspace(*v_start)) v_start++;
+
+			// TODO: drop trailing whitespace here too??
+
+			if (!*v_start) return NULL;
+			return v_start;
+		}
+next:
+		line = strtok_r(NULL, "\n", &temp);
+	}
+
+	return NULL; // not found
+}
+
+int
+cfs_create_guest_conf_property_msg(GString *str, memdb_t *memdb, char *prop, uint32_t vmid)
+{
+	g_return_val_if_fail(cfs_status.vmlist != NULL, -EINVAL);
+	g_return_val_if_fail(str != NULL, -EINVAL);
+
+	int prop_len = strlen(prop);
+	int res = 0;
+	GString *path = NULL;
+
+	g_mutex_lock (&mutex);
+
+	g_string_printf(str,"{\n");
+
+	GHashTable *ht = cfs_status.vmlist;
+	if (!g_hash_table_size(ht)) {
+		goto ret;
+	}
+
+	path = g_string_new_len(NULL, 256);
+	if (vmid >= 100) {
+		vminfo_t *vminfo = (vminfo_t *) g_hash_table_lookup(cfs_status.vmlist, &vmid);
+		if (vminfo == NULL) goto enoent;
+
+		if (!vminfo_to_path(vminfo, path)) goto err;
+
+		gpointer tmp = NULL;
+		int size = memdb_read(memdb, path->str, &tmp);
+		if (tmp == NULL || size <= prop_len) goto err;
+
+		char *val = _get_property_value(tmp, prop, prop_len);
+		if (val == NULL) {
+			g_free(tmp);
+			goto ret;
+		}
+
+		g_string_append_printf(str,"\"%u\": {\"%s\": \"%s\"}", vmid, prop, val);
+		g_free(tmp);
+
+	} else {
+		GHashTableIter iter;
+		g_hash_table_iter_init (&iter, ht);
+
+		gpointer key, value;
+		int first = 1;
+		while (g_hash_table_iter_next (&iter, &key, &value)) {
+			vminfo_t *vminfo = (vminfo_t *)value;
+
+			if (!vminfo_to_path(vminfo, path)) goto err;
+
+			gpointer tmp = NULL;
+			int size = memdb_read(memdb, path->str, &tmp);
+			if (tmp == NULL || size <= prop_len) continue;
+
+			char *val = _get_property_value(tmp, prop, prop_len);
+			if (val == NULL) {
+				g_free(tmp);
+				continue;
+			}
+
+			if (!first) g_string_append_printf(str, ",\n");
+			else first = 0;
+
+			g_string_append_printf(str,"\"%u\": {\"%s\": \"%s\"}", vminfo->vmid, prop, val);
+			g_free(tmp);
+		}
+	}
+ret:
+	if (path) g_string_free(path, TRUE); // path is NULL if the vmlist was empty
+	g_string_append_printf(str,"\n}\n");
+	g_mutex_unlock (&mutex);
+
+	return res;
+err:
+	res = -1;
+	goto ret;
+enoent:
+	res = -ENOENT;
+	goto ret;
+}
+
 void
 record_memdb_change(const char *path)
 {
diff --git a/data/src/status.h b/data/src/status.h
index 2e4153a..bda3e36 100644
--- a/data/src/status.h
+++ b/data/src/status.h
@@ -25,6 +25,7 @@
 
 #include "dfsm.h"
 #include "logger.h"
+#include "memdb.h"
 
 #define VMTYPE_QEMU 1
 #define VMTYPE_OPENVZ 2
@@ -158,5 +159,7 @@ int
 cfs_create_memberlist_msg(
 	GString *str);
 
+int
+cfs_create_guest_conf_property_msg(GString *str, memdb_t *memdb, char *prop, uint32_t vmid);
 
 #endif /* _PVE_STATUS_H_ */
-- 
2.20.1