[pve-devel] [PATCH cluster] add get_guest_config_property IPCC method

Wolfgang Bumiller w.bumiller at proxmox.com
Wed Jun 12 13:12:54 CEST 2019


On Tue, Jun 11, 2019 at 06:02:22AM +0200, Thomas Lamprecht wrote:
> This adds a new method to our IPCC interface.
> It's a helper to get a property of a single guest or of all guests.
> 
> It is restricted to look at the current config state only, i.e.,
> no PENDING changes and no snapshots; this is by design and wanted.
> It relies on the strict config format to be quick and to return or
> continue early. Those restrictions are imposed by
> PVE::QemuServer::parse_vm_config and its container counterpart; it
> mostly boils down to the following regex:
> /^([a-z][a-z_]*\d*):\s*(.+?)\s*$/
> and the fact that this is applied directly to the raw config lines
> (no whitespace trimming of any kind happens beforehand).
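
To illustrate, a few hypothetical config lines and how that regex (and
the code below) treats them:

    lock: backup       <- matches, property "lock", value "backup"
    onboot: 1          <- matches
     lock: backup      <- no match, leading whitespace is never trimmed
    [snap1]            <- section start, current-config parsing stops here
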
> 
> Motivation for this work is to get the lock state of all guests
> _quickly_, allowing us to pass this info in our resource call and
> enhance the tree view of our Web User Interface with that
> information. It was kept a bit more general, without bending the code
> too much. The call returns a serialized JSON object with the format:
> VMID => { PROPERTY => PROPERTY_VALUE }
> if the property was found in the VMID's configuration; an empty
> object will be returned if it was not found.
> 
> If one passes 0 as the VMID, all VM configurations will be searched,
> and only those with a match will be returned, in the same manner as
> above.
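
As a usage sketch from the Perl side (hypothetical VMIDs and values):

    my $all = PVE::Cluster::get_guest_config_property('lock');
    # e.g. { 100 => { lock => 'backup' }, 1337 => { lock => 'migrate' } }

    my $one = PVE::Cluster::get_guest_config_property('lock', 100);
    # { 100 => { lock => 'backup' } }, or {} if VMID 100 has no lock set
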
> 
> So why an IPCC call and not handling this in Perl? Well, performance.
> Dominik proposed a Perl + cfs_read_file approach[0]; while that is
> relatively short and stays in the (mostly) safer Perl land, it's
> pretty slow, especially on first connections. The idea for this had
> existed in my head for quite a while, but Dominik's patch put it in
> front, and a prototype of this was born soon after. Initial
> evaluation and performance comparison showed a speedup of >100 times
> for the initial gathering. [0] really benefits from the ccache
> afterwards, though, and that _is_ a cache which gets used and hit
> often, so more "serial runs" (i.e., in the same worker) make his
> approach more reasonable. But after a bit of work this came into a
> not too ugly shape, so here are the numbers, with 10005 VM configs,
> all "real" ones with about 502 bytes space usage each, and all with
> a lock in the worst place, at the end.
> 
> Legend:
> C0: how many "serial runs" took place, i.e., how many runs in the
>     same worker
> C1: IPCC (this) approach total runtime in ms
> C2: IPCC (this) approach per-run runtime in ms, i.e., C1 / C0
> C3: Perl + cfs_read_file ([0]) approach total runtime in ms
> C4: Perl + cfs_read_file ([0]) approach per-run runtime in ms, i.e., C3 / C0
> 
> Data:
> C0	C1	C2	C3	C4
> 1	18.31	18.31	3570.29	3570.29
> 2	31.51	15.76	3717.69	1858.84
> 3	44.19	14.73	3821.84	1273.95
> 4	58.54	14.63	3950.24	987.56
> 5	70.31	14.06	4071.42	814.28
> 6	95.29	15.88	4175.95	695.99
> 7	95.87	13.70	4192.35	598.91
> 8	111.81	13.98	4346.84	543.36
> 9	120.84	13.43	4432.13	492.46
> 10	134.52	13.45	4554.25	455.42
> 11	149.74	13.61	4673.71	424.88
> 12	161.13	13.43	4797.56	399.80
> 13	172.74	13.29	4892.15	376.32
> 14	180.63	12.90	4951.17	353.66
> 15	199.04	13.27	5034.51	335.63
> 
> So, initially C beats Perl (not really the correct wording, but hey)
> by >200 times. But on the second run we immediately see that while
> IPCC scales almost linearly, the Perl one doesn't: it really benefits
> from the cache now; while the initial call needed 3.5s, this one
> "only" needs ~200 ms more. But 200 ms is still quite an addition to
> an API call. 10k VMs are not seen too often in the wild, I guess,
> albeit it's easily possible with setups using a lot of containers;
> here we know about users having ~1500 CTs on only two nodes, so it's
> not completely unrealistic either.
> 
> [0]: https://pve.proxmox.com/pipermail/pve-devel/2019-February/035830.html
> 
> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
> ---
>  data/PVE/Cluster.pm    |  15 +++++
>  data/src/cfs-ipc-ops.h |   2 +
>  data/src/server.c      |  28 ++++++++
>  data/src/status.c      | 148 +++++++++++++++++++++++++++++++++++++++++
>  data/src/status.h      |   3 +
>  5 files changed, 196 insertions(+)
> 
> diff --git a/data/PVE/Cluster.pm b/data/PVE/Cluster.pm
> index 7f9b88e..d02a867 100644
> --- a/data/PVE/Cluster.pm
> +++ b/data/PVE/Cluster.pm
> @@ -595,6 +595,21 @@ sub get_node_kv {
>      return $res;
>  }
>  
> +# property: a config property you want to get, e.g., this is perfect to get
> +# the 'lock' entry of a guest _fast_ (>100 times faster than manual parsing)
> +# vmid: optional, if a valid one is passed we only check that one, else return all
> +# NOTE: does *not* search snapshot and PENDING entries sections!
> +sub get_guest_config_property {
> +    my ($property, $vmid) = @_;
> +
> +    die "property is required" if !defined($property);
> +
> +    my $bindata = pack "VZ*", $vmid // 0, $property;
> +    my $res = $ipcc_send_rec_json->(CFS_IPC_GET_GUEST_CONFIG_PROPERTY, $bindata);
> +
> +    return $res;
> +}
> +
>  # $data must be a chronological descending ordered array of tasks
>  sub broadcast_tasklist {
>      my ($data) = @_;
> diff --git a/data/src/cfs-ipc-ops.h b/data/src/cfs-ipc-ops.h
> index a1bf51d..e4026ad 100644
> --- a/data/src/cfs-ipc-ops.h
> +++ b/data/src/cfs-ipc-ops.h
> @@ -39,4 +39,6 @@
>  
>  #define CFS_IPC_GET_RRD_DUMP 10
>  
> +#define CFS_IPC_GET_GUEST_CONFIG_PROPERTY 11
> +
>  #endif
> diff --git a/data/src/server.c b/data/src/server.c
> index 54289c7..c947091 100644
> --- a/data/src/server.c
> +++ b/data/src/server.c
> @@ -83,6 +83,11 @@ typedef struct {
>  	uint32_t res3;
>  } cfs_log_get_request_header_t;
>  
> +typedef struct {
> +	struct qb_ipc_request_header req_header;
> +	uint32_t vmid;
> +	char property[];
> +} cfs_guest_config_propery_get_request_header_t;
>  
>  struct s1_context {
>  	int32_t client_pid;
> @@ -313,6 +318,29 @@ static int32_t s1_msg_process_fn(
>  			cfs_rrd_dump(outbuf);
>  			result = 0;
>  		}
> +	} else if (req_pt->id == CFS_IPC_GET_GUEST_CONFIG_PROPERTY) {
> +
> +		cfs_guest_config_propery_get_request_header_t *rh =
> +			(cfs_guest_config_propery_get_request_header_t *) data;
> +
> +		int proplen = req_pt->size - G_STRUCT_OFFSET(cfs_guest_config_propery_get_request_header_t, property);
> +
> +		result = 0;
> +		if (rh->vmid < 100 && rh->vmid != 0) {
> +			cfs_debug("vmid out of range %u", rh->vmid);
> +			result = -EINVAL;
> +		} else if (rh->vmid >= 100 && !vmlist_vm_exists(rh->vmid)) {
> +			result = -ENOENT;
> +		} else if (proplen <= 0) {
> +			cfs_debug("proplen <= 0, %d", proplen);
> +			result = -EINVAL;
> +		} else {
> +			((char *)data)[req_pt->size] = 0; // ensure property is 0 terminated
> +
> +			cfs_debug("cfs_get_guest_config_property: basic valid checked, do request");
> +
> +			result = cfs_create_guest_conf_property_msg(outbuf, memdb, rh->property, rh->vmid);
> +		}
>  	}
>  
>  	cfs_debug("process result %d", result);
> diff --git a/data/src/status.c b/data/src/status.c
> index 177332d..d07cbb9 100644
> --- a/data/src/status.c
> +++ b/data/src/status.c
> @@ -33,9 +33,11 @@
>  #include <rrd.h>
>  #include <rrd_client.h>
>  #include <time.h>
> +#include <ctype.h>
>  
>  #include "cfs-utils.h"
>  #include "status.h"
> +#include "memdb.h"
>  #include "logger.h"
>  
>  #define KVSTORE_CPG_GROUP_NAME "pve_kvstore_v1"
> @@ -171,6 +173,27 @@ static const char *vminfo_type_to_string(vminfo_t *vminfo) {
>      }
>  }
>  
> +static const char *vminfo_type_to_path_type(vminfo_t *vminfo) {
> +    if (vminfo->vmtype == VMTYPE_QEMU) {
> +        return "qemu-server"; // special case..
> +    } else {
> +        return vminfo_type_to_string(vminfo);
> +    }
> +}
> +int vminfo_to_path(vminfo_t *vminfo, GString *path)
> +{
> +	g_return_val_if_fail(vminfo != NULL, -1);
> +	g_return_val_if_fail(path != NULL, -1);
> +
> +	if (!vminfo->nodename)
> +		return 0;
> +
> +	const char *type = vminfo_type_to_path_type(vminfo);
> +	g_string_printf(path, "/nodes/%s/%s/%u.conf", vminfo->nodename, type, vminfo->vmid);
> +
> +	return 1;
> +}
> +
>  void cfs_clnode_destroy(
>  	cfs_clnode_t *clnode)
>  {
> @@ -763,6 +786,131 @@ cfs_create_vmlist_msg(GString *str)
>  	return 0;
>  }
>  
> +// checks the config for a line starting with '$prop:' and returns the value
> +// after it, without initial whitespace(s). We only deal with the format
> +// restrictions imposed by our perl VM config parser, main reference is
> +// PVE::QemuServer::parse_vm_config; this allows us to be very fast and still
> +// relatively simple
> +// the main restriction used to our advantage is the property match regex:
> +// ($line =~ m/^([a-z][a-z_]*\d*):\s*(.+?)\s*$/) from parse_vm_config
> +// currently we only look at the current configuration in place, i.e., *no*
> +// snapshots and *no* pending changes
> +static char *
> +_get_property_value(char *conf, char *prop, int prop_len)

prop should be const

> +{
> +	char *line, *temp;

initializing `temp` to NULL shuts up a warning gcc seems to spit out on
my end here...
status.c:816:10: error: ‘__s’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    char *v_start = &line[prop_len + 1];

`__s` seems to come from inlining strtok or some such, a gcc bug
apparently...
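
i.e., a one-line change:

	char *line, *temp = NULL;
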


> +
> +	line = strtok_r(conf, "\n", &temp);
> +	while (line != NULL) {
> +		if (!line[0]) goto next;
> +
> +		// snapshot or pending section start and nothing found yet
> +		if (line[0] == '[') return NULL;
> +		// properties start with /^[a-z]/, so continue early if not
> +		if (line[0] < 'a' || line[0] > 'z') goto next;
> +
> +		int line_len = strlen(line);
> +		if (line_len <= prop_len + 1) goto next;
> +
> +		if (line[prop_len] == ':' && strncmp(line, prop, prop_len) == 0) { // found

This strncmp could be a memcmp given all the checks above (particularly
the strlen() check). strncmp() checks for null bytes, memcmp doesn't
care (and we already verified with strlen() that there aren't any null
bytes within [0..prop_len]).
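
i.e., something like:

	if (line[prop_len] == ':' && memcmp(line, prop, prop_len) == 0) { // found
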

> +			char *v_start = &line[prop_len + 1];
> +
> +			// drop initial value whitespaces here already
> +			while (*v_start && isspace(*v_start)) v_start++;
> +
> +			// TODO: drop trailing whitespace here too??

We already have `line_len`, so going backwards from that should be
efficient enough.
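
Untested sketch of what I mean:

	// walk back from the already-known end of the line and cut off
	// any trailing whitespace before returning the value
	char *v_end = &line[line_len - 1];
	while (v_end > v_start && isspace((unsigned char)*v_end))
		*v_end-- = '\0';
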

> +
> +			if (!*v_start) return NULL;
> +			return v_start;
> +		}
> +next:
> +		line = strtok_r(NULL, "\n", &temp);
> +	}
> +
> +	return NULL; // not found
> +}
> +
> +int
> +cfs_create_guest_conf_property_msg(GString *str, memdb_t *memdb, char *prop, uint32_t vmid)

prop should be const

> +{
> +	g_return_val_if_fail(cfs_status.vmlist != NULL, -EINVAL);
> +	g_return_val_if_fail(str != NULL, -EINVAL);
> +
> +	int prop_len = strlen(prop);
> +	int res = 0;
> +	GString *path = NULL;
> +
> +	g_mutex_lock (&mutex);
> +
> +	g_string_printf(str,"{\n");
> +
> +	GHashTable *ht = cfs_status.vmlist;
> +	if (!g_hash_table_size(ht)) {
> +		goto ret;
> +	}
> +
> +	path = g_string_new_len(NULL, 256);
> +	if (vmid >= 100) {
> +		vminfo_t *vminfo = (vminfo_t *) g_hash_table_lookup(cfs_status.vmlist, &vmid);
> +		if (vminfo == NULL) goto enoent;
> +
> +		if (!vminfo_to_path(vminfo, path)) goto err;
> +
> +		gpointer tmp = NULL;
> +		int size = memdb_read(memdb, path->str, &tmp);
> +		if (tmp == NULL || size <= prop_len) goto err;

The 'or' part of the condition would have a valid `tmp` pointer needing
to be `g_free()`d, no? So maybe just unconditionally call free before
the goto?
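
e.g., roughly:

	if (tmp == NULL || size <= prop_len) {
		g_free(tmp); // g_free(NULL) is a documented no-op, so this is safe
		goto err;
	}
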

> +
> +		char *val = _get_property_value(tmp, prop, prop_len);
> +		if (val == NULL) {
> +			g_free(tmp);
> +			goto ret;
> +		}
> +
> +		g_string_append_printf(str,"\"%u\": { \"%s\": \"%s\"\n }", vmid, prop, val);

Should we not sanity-check the value for double quotes here?
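
Untested sketch, simply skipping values we cannot embed safely:

	// a '"' inside the value would terminate the JSON string early
	if (strchr(val, '"') != NULL) {
		g_free(tmp);
		goto ret;
	}
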

> +		g_free(tmp);
> +
> +	} else {
> +		GHashTableIter iter;
> +		g_hash_table_iter_init (&iter, ht);
> +
> +		gpointer key, value;
> +		int first = 1;
> +		while (g_hash_table_iter_next (&iter, &key, &value)) {
> +			vminfo_t *vminfo = (vminfo_t *)value;
> +
> +			if (!vminfo_to_path(vminfo, path)) goto err;
> +
> +			gpointer tmp = NULL;
> +			int size = memdb_read(memdb, path->str, &tmp);
> +			if (tmp == NULL || size <= prop_len) continue;

Same g_free(tmp) as above

> +
> +			char *val = _get_property_value(tmp, prop, prop_len);
> +			if (val == NULL) {
> +				g_free(tmp);
> +				continue;
> +			}
> +
> +			if (!first) g_string_append_printf(str, ",\n");
> +			else first = 0;
> +
> +			g_string_append_printf(str,"\"%u\": {\"%s\": \"%s\"}", vminfo->vmid, prop, val);
> +			g_free(tmp);
> +		}
> +	}
> +ret:
> +	g_string_free(path, TRUE);
> +	g_string_append_printf(str,"\n}\n");
> +	g_mutex_unlock (&mutex);
> +
> +	return res;
> +err:
> +	res = -1;

I'm not a fan of mixing hardcoded values and errno values. This
corresponds to EPERM which is a rather weird error in those cases.
Granted, there is often no good error code, so I'd just stick to EINVAL
(or EIO to distinguish it from the preconditions).
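
i.e.:

	err:
		res = -EIO; // or -EINVAL; just not a bare -1, which is -EPERM
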

> +	goto ret;
> +enoent:
> +	res = -ENOENT;
> +	goto ret;
> +}
> +
>  void
>  record_memdb_change(const char *path)
>  {
> diff --git a/data/src/status.h b/data/src/status.h
> index 2e4153a..bda3e36 100644
> --- a/data/src/status.h
> +++ b/data/src/status.h
> @@ -25,6 +25,7 @@
>  
>  #include "dfsm.h"
>  #include "logger.h"
> +#include "memdb.h"
>  
>  #define VMTYPE_QEMU 1
>  #define VMTYPE_OPENVZ 2
> @@ -158,5 +159,7 @@ int
>  cfs_create_memberlist_msg(
>  	GString *str);
>  
> +int
> +cfs_create_guest_conf_property_msg(GString *str, memdb_t *memdb, char *prop, uint32_t vmid);
>  
>  #endif /* _PVE_STATUS_H_ */
> -- 
> 2.20.1
> 
> 