[pve-devel] [RFC cluster/manager] prepare for cluster wide ceph dashboard
Dominik Csapak
d.csapak at proxmox.com
Fri Apr 26 08:21:24 CEST 2019
In order to have a better Ceph dashboard that is available cluster-wide,
we have to add a few things:

* a cluster-wide status API call
  This is not that hard, since in a hyperconverged setup we always have
  the info about the monitors and how to connect to them.
* a list of existing services
  Ceph only manages monitors and OSDs that exist, but does not care
  about MDS or MGR instances, especially if they are not running.
  We do put the mons/mgrs/mds into the ceph.conf, but this is not
  mandatory for a working Ceph setup.
  I implemented this with a cluster-wide synced list of the existing
  systemd units for those service types (mon/mgr/mds), so we can show
  which services are enabled where, independent of Ceph's status and
  config.
  This way a user can see if any wrong service is left over, can see
  services that are not started (and not in the config), or can see
  that a service is running but not enabled (and thus would not be
  running after e.g. a restart).
* a list of versions of the services
  This is also not that hard and is accomplished with a call to
  'YYY metadata' via RADOS. There we get, for each running service,
  its name, host and version.
  With this information we can warn the user that some services are
  running an older version, so that they can restart them. A rough
  sketch of such a metadata query is shown below.
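To make that more concrete, here is a minimal sketch (not the actual
patch): PVE::RADOS with its mon_command method is used as in the
existing pve-manager Ceph code, everything else (helper name, result
layout) is just an assumption for illustration:

    use PVE::RADOS;

    # sketch only: query 'ceph <type> metadata' via RADOS and index the
    # result by service name
    sub get_ceph_service_metadata {
        my ($type) = @_; # 'mon', 'mgr' or 'mds'

        my $rados = PVE::RADOS->new();
        my $metadata = $rados->mon_command({ prefix => "$type metadata" });

        my $res = {};
        for my $entry (@$metadata) {
            my $name = $entry->{name} // $entry->{id};
            $res->{$name} = {
                host => $entry->{hostname},
                version => $entry->{ceph_version},
            };
        }
        return $res;
    }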
I am sending this as an RFC, since there are the following things I am
not so sure about, and I wanted comments before I begin with the work
on the GUI part:
* the cluster sync interface
  I am not so sure if this is the best way, but we have wanted such a
  thing a few times now and it seems to work pretty well.
  We just have to be careful how we use it, so that we do not fill
  pmxcfs with unnecessary things or rely too much on it.
  A sketch of the idea follows below.
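The idea, as a minimal sketch (the function names and semantics here
are assumptions, not necessarily what the pve-cluster patch adds): a
node broadcasts a small JSON blob under a key via the pmxcfs status
mechanism, and every node can read back the values of all members:

    use JSON;
    use PVE::Cluster;

    # hypothetical broadcast: publish $data under $key for the local node
    sub broadcast_data {
        my ($key, $data) = @_;
        # assumed low-level kv broadcast via the pmxcfs status IPC
        PVE::Cluster::broadcast_node_kv($key, encode_json($data));
    }

    # hypothetical read-back: collect all nodes' values as { node => data }
    sub get_data {
        my ($key) = @_;
        my $raw = PVE::Cluster::get_node_kv($key) // {};
        my $res = {};
        for my $node (keys %$raw) {
            $res->{$node} = eval { decode_json($raw->{$node}) } // {};
        }
        return $res;
    }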
* the service/metadata structure merging
  I mangle the 'YYY metadata' and the service lists into a single
  structure, so that I can later process it in the GUI.
  I am not really happy with how this is done, but could not think of
  a better way (I tried several things). The sketch below shows the
  general idea.
  The only way left that might make things better is to abandon the
  generic data broadcast interface and write one especially for this
  case, though I do not really like the idea of having Ceph-relevant
  code in pve-cluster.
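Purely for illustration (field names and shapes are assumptions here,
not the exact structures of the patches), the merge boils down to
overlaying the broadcast unit lists with what Ceph reports about
running services:

    # $unitlists: { node => { type => { name => { ... } } } } from the broadcast
    # $metadata:  { type => { name => { host => ..., version => ... } } } via RADOS
    sub merge_service_info {
        my ($unitlists, $metadata) = @_;

        my $merged = {};

        # first, everything that has an enabled systemd unit somewhere
        for my $node (keys %$unitlists) {
            for my $type (keys %{$unitlists->{$node}}) {
                for my $name (keys %{$unitlists->{$node}->{$type}}) {
                    $merged->{$type}->{$name}->{host} //= $node;
                    $merged->{$type}->{$name}->{enabled} = 1;
                }
            }
        }

        # then overlay what ceph itself knows about running services
        for my $type (keys %$metadata) {
            for my $name (keys %{$metadata->{$type}}) {
                my $entry = $metadata->{$type}->{$name};
                $merged->{$type}->{$name}->{host} //= $entry->{host};
                $merged->{$type}->{$name}->{version} = $entry->{version};
                $merged->{$type}->{$name}->{running} = 1;
            }
        }

        return $merged;
    }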
* collecting unit info via links in /etc/systemd/system
  As I see it, we have 3 options to get the existing services for Ceph:
  1. calling DBus to query it (expensive, complicated)
  2. parsing 'systemctl list-units GLOB' (complicated, potentially
     error-prone)
  3. parsing the symlinks in /etc/systemd/system (fast, relatively
     easy, should be stable)
  I opted for option 3 (for now), but if someone has another option,
  or a compelling opinion for any of the other options, please share
  it. A sketch of the symlink approach follows below.
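A minimal sketch of option 3, assuming the Ceph packaging enables the
service instances as symlinks under
/etc/systemd/system/ceph-<type>.target.wants/ (the exact paths and
result layout may differ from what the patch uses):

    use strict;
    use warnings;

    # sketch only: collect the locally enabled mon/mgr/mds units by
    # looking at the symlinks systemd creates on 'systemctl enable'
    sub get_local_ceph_services {
        my $res = { mon => {}, mgr => {}, mds => {} };

        for my $type (keys %$res) {
            my $dir = "/etc/systemd/system/ceph-$type.target.wants";
            opendir(my $dh, $dir) or next;
            while (defined(my $unit = readdir($dh))) {
                # enabled instances look like 'ceph-mon@<id>.service'
                if ($unit =~ m/^ceph-\Q$type\E\@(.+)\.service$/) {
                    $res->{$type}->{$1} = { enabled => 1 };
                }
            }
            closedir($dh);
        }
        return $res;
    }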
pve-cluster:

Dominik Csapak (1):
add generic data broadcast interface
data/PVE/Cluster.pm | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
pve-manager:
Dominik Csapak (3):
add get_local_services for ceph
broadcast ceph service data to cluster
add cluster wide ceph api calls
PVE/API2/Cluster.pm | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
PVE/Ceph/Services.pm | 18 ++++++++++++++
PVE/Service/pvestatd.pm | 14 +++++++++++
3 files changed, 96 insertions(+)
--
2.11.0