[pve-devel] [RFC container] Improve feedback for startup
Wolfgang Bumiller
w.bumiller at proxmox.com
Thu Aug 27 10:44:04 CEST 2020
On Thu, Aug 20, 2020 at 11:36:39AM +0200, Thomas Lamprecht wrote:
> On 19.08.20 12:30, Fabian Ebner wrote:
> > Since it was necessary to switch to 'Type=Simple' in the systemd
> > service (see 545d6f0a13ac2bf3a8d3f224c19c0e0def12116d ),
> > 'systemctl start pve-container at ID' would not wait for the 'lxc-start'
> > command anymore. Thus every container start was reported as a success
> > and the 'post-start' hook would trigger immediately after the
> > 'systemctl start' command.
> >
> > Use 'lxc-monitor' to get the necessary information and detect
> > startup failure and only run the 'post-start' hookscript after
> > the container is effectively running. If something goes wrong
> > with the monitor, fall back to the old behavior.
> >
> > Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
> > ---
> > src/PVE/LXC.pm | 36 +++++++++++++++++++++++++++++++++++-
> > 1 file changed, 35 insertions(+), 1 deletion(-)
> >
>
> appreciate the effort!
> We could also directly connect to /run/lxc/var/lib/lxc/monitor-fifo (or the abstract
> unix socket, but not much gained/difference here) of the lxc-monitord which publishes
> all state changes and unpack the new state [0] directly.
>
> [0] https://github.com/lxc/lxc/blob/8bdacc22a48f9c09902a1d2febd71439cb38c082/src/lxc/state.h#L10
>
> @Wolfgang, what do you think?
Just tested adding a state client to our Command.pm directly, seems to
work, so we would depend neither on lxc-monitor nor lxc-monitord.
Example & code follow below. The only issue with it is that we'd need to
retry connecting to the command socket a few times since we don't know
when it becomes available, but that shouldn't be too bad IMO.
But first some inline comments to the original patch.
> > diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
> > index db5b8ca..35dc54c 100644
> > --- a/src/PVE/LXC.pm
> > +++ b/src/PVE/LXC.pm
> > @@ -2191,10 +2191,44 @@ sub vm_start {
> >
> > PVE::Storage::activate_volumes($storage_cfg, $vollist);
> >
> > + my $monitor_pid = open(my $monitor_fh, '-|', "/usr/bin/lxc-monitor -n $vmid")
> > + or warn "could not open pipe to lxc-monitor\n";
> > +
> > my $cmd = ['systemctl', 'start', "pve-container\@$vmid"];
> >
> > PVE::GuestHelpers::exec_hookscript($conf, $vmid, 'pre-start', 1);
> > - eval { PVE::Tools::run_command($cmd); };
> > + eval {
> > + PVE::Tools::run_command($cmd);
> > +
> > + my $success;
> > + if ($monitor_pid) {
> > + eval {
> > + local $SIG{ALRM} = sub { die "got timeout\n" };
> > + alarm(10); # 'STARTING' should appear quickly
We have a `run_with_timeout` helper in Tools which is more robust than
this. There can be all sorts of interactions between alarms, `eval` and
the rest of the code. For one, this overrides a previous alarm without
restoring it.
> > +
> > + while (my $line = <$monitor_fh>) {
> > + if ($line =~ m/^'$vmid' changed state to \[([A-Z]*)\]$/) {
> > + my $status = $1;
> > + alarm(0);
> > + $success = 1 if $status eq 'RUNNING';
> > + $success = 0 if $status eq 'ABORTING'
> > + || $status eq 'STOPPING'
> > + || $status eq 'STOPPED';
> > + if (defined($success)) {
> > + kill('KILL', $monitor_pid);
> > + waitpid($monitor_pid, 0);
> > + }
> > + } else {
> > + die "unexpected output from lxc-monitor: $line\n";
> > + }
> > + }
> > + };
> > + warn "Problem with lxc-monitor: $@" if $@;
Secondly, if we reach this part successfully, the alarm can still
trigger *HERE* which is already outside the `eval`, where we'd throw
back the `got timeout` message instead of the `lxc-start failed ...`
message, which could get potentially confusing. We do have a 2nd eval
around in this code, but that's a little "far". Eg. the 'unexpected
output from lxc-monitor' error might get swallowed if it triggers and
the alarm goes off before the above `warn` line, replacing the current
contents of '$@'... interrupts are pure pain.
> > + alarm(0);
> > + }
> > + die "'lxc-start' failed for container '$vmid'\n"
> > + if defined($success) && !$success;
> > + };
> > if (my $err = $@) {
> > unlink $skiplock_flag_fn;
> > die $err;
> >
Usage example:
use PVE::LXC::Command;
my $sock = PVE::LXC::Command::get_state_client(404);
die "not running\n" if !defined($sock);
while (1) {
my ($type, $name, $value) = PVE::LXC::Command::read_lxc_message($sock);
last if !defined($type);
print("$name: $type => $value\n");
}
Patch for Command.pm:
---8<---
>From 6ac578ef889a3a9c8aefc4f05215b4ec66049546 Mon Sep 17 00:00:00 2001
From: Wolfgang Bumiller <w.bumiller at proxmox.com>
Date: Thu, 27 Aug 2020 10:31:06 +0200
Subject: [PATCH container] command: add state client functions
Signed-off-by: Wolfgang Bumiller <w.bumiller at proxmox.com>
---
src/PVE/LXC/Command.pm | 91 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 91 insertions(+)
diff --git a/src/PVE/LXC/Command.pm b/src/PVE/LXC/Command.pm
index beed890..6df767d 100644
--- a/src/PVE/LXC/Command.pm
+++ b/src/PVE/LXC/Command.pm
@@ -11,20 +11,36 @@ use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM SOL_SOCKET SO_PASSCRED);
+use POSIX qw(NAME_MAX);
use base 'Exporter';
use constant {
+ LXC_CMD_GET_STATE => 3,
LXC_CMD_GET_CGROUP => 6,
+ LXC_CMD_ADD_STATE_CLIENT => 10,
LXC_CMD_FREEZE => 15,
LXC_CMD_UNFREEZE => 16,
LXC_CMD_GET_LIMITING_CGROUP => 19,
};
+use constant {
+ STATE_STOPPED => 0,
+ STATE_STARTING => 1,
+ STATE_RUNNING => 2,
+ STATE_STOPPING => 3,
+ STATE_ABORTING => 4,
+ STATE_FREEZING => 5,
+ STATE_FROZEN => 6,
+ STATE_THAWED => 7,
+ MAX_STATE => 8,
+};
+
our @EXPORT_OK = qw(
raw_command_transaction
simple_command
get_cgroup_path
+ get_state_client
);
# Get the command socket for a container.
@@ -81,6 +97,33 @@ my sub _unpack_lxc_cmd_rsp($) {
return ($ret, $len);
}
+my $LXC_MSG_SIZE = length(pack('I! Z'.(NAME_MAX+1).' x![I] I', 0, "", 0));
+# Unpack an lxc_msg struct.
+my sub _unpack_lxc_msg($) {
+ my ($packet) = @_;
+
+ # struct lxc_msg {
+ # lxc_msg_type_t type;
+ # char name[NAME_MAX+1];
+ # int value;
+ # };
+
+ my ($type, $name, $value) = unpack('I!Z'.(NAME_MAX+1).'I!', $packet);
+
+ if ($type == 0) {
+ $type = 'STATE';
+ } elsif ($type == 1) {
+ $type = 'PRIORITY';
+ } elsif ($type == 2) {
+ $type = 'EXITCODE';
+ } else {
+ warn "unsupported lxc message type $type received\n";
+ $type = undef;
+ }
+
+ return ($type, $name, $value);
+}
+
# Send a complete packet:
my sub _do_send($$) {
my ($sock, $data) = @_;
@@ -206,4 +249,52 @@ sub unfreeze($$) {
return $res;
}
+# Add this command socket as a state client.
+#
+# Currently all states are observed.
+#
+# Returns undef if the container is not running, dies on errors.
+sub get_state_client($) {
+ my ($vmid) = @_;
+
+ my $socket = _get_command_socket($vmid)
+ or return undef;
+
+ # For now we want all states (except 'reboots', since we would never see those, reboots would
+ # use a value of '2' for STATE_RUNNING)
+ my $states = pack('I!', 2) x MAX_STATE;
+
+ my ($res, undef) = raw_command_transaction($socket, LXC_CMD_ADD_STATE_CLIENT, $states);
+ if ($res != MAX_STATE) {
+ die "container is currently in unexpected state $res\n";
+ }
+
+ return $socket;
+}
+
+# Read an lxc message from a socket.
+#
+# Returns undef on EOF (if lxc exits).
+# Otherwise returns a (type, vmid, value) tuple.
+#
+# The returned 'type' currently can be 'STATE', 'PRIORITY' or 'EXITSTATUS'.
+sub read_lxc_message($) {
+ my ($socket) = @_;
+
+ my $msg;
+ my $got = recv($socket, $msg, $LXC_MSG_SIZE, 0)
+ // die "failed to read from state socket: $!\n";
+
+ if (length($msg) == 0) {
+ return undef;
+ }
+
+ die "short read on state socket ($LXC_MSG_SIZE != ".length($msg).")\n"
+ if length($msg) != $LXC_MSG_SIZE;
+
+ my ($type, $name, $value) = _unpack_lxc_msg($msg);
+
+ return ($type, $name, $value);
+}
+
1;
--
2.20.1
More information about the pve-devel
mailing list