[pve-devel] [PATCH v4 ha-manager 1/3] cleanup backup & mounted locks after recovery (fixes #1100)

Thomas Lamprecht t.lamprecht at proxmox.com
Thu Sep 15 10:45:15 CEST 2016


This cleans up the backup and mounted locks after a service is
recovered from a failed node, otherwise it may not start if a locked
action was in progress during the node failure.
We allow deletion of backup and mounted locks, as removing them is
safe once the node which hosted the locked service has been fenced.
We do not allow snapshot lock deletion, as that needs more manual
cleanup; also, snapshots are normally triggered manually.
Further, ignore migration locks for now. We should rethink that, but
as migration is a manually triggered action it should be OK to not
auto-delete its lock for the moment.
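
Concretely, only the lock types listed in the cleanup helper added
below are ever removed; a rough sketch of that list:

    # locks which are safe to drop once the owning node is fenced
    my $removable_locks = ['backup', 'mounted'];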

We cannot remove locks via the remove_lock method provided by
PVE::AbstractConfig, as this method is well behaved and does not
allow removing locks from VMs/CTs located on another node.  We also
do not want to adapt this method to allow arbitrary lock removal,
independent of which node the config is located on, as this could
invite misuse in the future. After all, one of our base principles is
that a node owns its VMs/CTs (and their configs) and only the owner
itself may change the state of a VM/CT.

The HA manager needs to be able to change the state of services
when a node has failed, and it is also allowed to do so, but only if
the node is fenced and we need to recover a service from it.

So we (re)implement the lock removal functionality in the resource
plugins.
We call it only if a node was fenced, and only *before* stealing the
service. After all, our justification for removing a lock is that its
owner (the node) is fenced. After stealing the service we have already
changed the owner, and the new owner is *not* fenced, so the
justification no longer holds - the new owner may already be doing
something with the service (config changes, etc.).
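
In rough outline, the recovery path in Manager.pm then does (sketch of
the change below, not the full hunk):

    # inside $recover_fenced_service, after a recovery node was selected;
    # the service still belongs to the fenced node at this point
    &$fence_recovery_cleanup($self, $sid, $fenced_node);

    # only afterwards does ownership change to the recovery node
    $haenv->steal_service($sid, $sd->{node}, $recovery_node);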

Add the respective log.expect output from the added test to enable
regression testing of this issue.

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---

changes since v3:
* rebased on current master

 src/PVE/HA/Manager.pm                    | 22 ++++++++++++++++
 src/PVE/HA/Resources.pm                  |  6 +++++
 src/PVE/HA/Resources/PVECT.pm            | 23 +++++++++++++++++
 src/PVE/HA/Resources/PVEVM.pm            | 23 +++++++++++++++++
 src/PVE/HA/Sim/Resources.pm              | 15 +++++++++++
 src/test/test-locked-service1/log.expect | 44 ++++++++++++++++++++++++++++++++
 6 files changed, 133 insertions(+)
 create mode 100644 src/test/test-locked-service1/log.expect

diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index c60df7c..e6dab7a 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -238,6 +238,26 @@ my $change_service_state = sub {
 		" to '${new_state}'$text_state");
 };
 
+# clean up a possible bad state from a recovered service to allow its start
+my $fence_recovery_cleanup = sub {
+    my ($self, $sid, $fenced_node) = @_;
+
+    my $haenv = $self->{haenv};
+
+    my (undef, $type, $id) = PVE::HA::Tools::parse_sid($sid);
+    my $plugin = PVE::HA::Resources->lookup($type);
+
+    # should not happen
+    die "unknown resource type '$type'" if !$plugin;
+
+    # locks may block recovery, cleanup those which are safe to remove after fencing
+    my $removable_locks = ['backup', 'mounted'];
+    if (my $removed_lock = $plugin->remove_locks($haenv, $id, $removable_locks, $fenced_node)) {
+	$haenv->log('warning', "removed leftover lock '$removed_lock' from recovered " .
+	            "service '$sid' to allow its start.");
+    }
+};
+
 # after a node was fenced this recovers the service to a new node
 my $recover_fenced_service = sub {
     my ($self, $sid, $cd) = @_;
@@ -264,6 +284,8 @@ my $recover_fenced_service = sub {
 	$haenv->log('info', "recover service '$sid' from fenced node " .
 		    "'$fenced_node' to node '$recovery_node'");
 
+	&$fence_recovery_cleanup($self, $sid, $fenced_node);
+
 	$haenv->steal_service($sid, $sd->{node}, $recovery_node);
 
 	# $sd *is normally read-only*, fencing is the exception
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index 3836fc8..96d2f8f 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -124,6 +124,12 @@ sub check_running {
     die "implement in subclass";
 }
 
+sub remove_locks {
+    my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+    die "implement in subclass";
+}
+
 
 # package PVE::HA::Resources::IPAddr;
 
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index b6ebe2f..d1312ab 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -114,4 +114,27 @@ sub check_running {
     return PVE::LXC::check_running($vmid);
 }
 
+sub remove_locks {
+    my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+    $service_node = $service_node || $haenv->nodename();
+
+    my $conf = PVE::LXC::Config->load_config($id, $service_node);
+
+    return undef if !defined($conf->{lock});
+
+    foreach my $lock (@$locks) {
+	if ($conf->{lock} eq $lock) {
+	    delete $conf->{lock};
+
+	    my $cfspath = PVE::LXC::Config->cfs_config_path($id, $service_node);
+	    PVE::Cluster::cfs_write_file($cfspath, $conf);
+
+	    return $lock;
+	}
+    }
+
+    return undef;
+}
+
 1;
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 4c06df9..55d4368 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -116,4 +116,27 @@ sub check_running {
     return PVE::QemuServer::check_running($vmid, 1, $nodename);
 }
 
+sub remove_locks {
+    my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+    $service_node = $service_node || $haenv->nodename();
+
+    my $conf = PVE::QemuConfig->load_config($id, $service_node);
+
+    return undef if !defined($conf->{lock});
+
+    foreach my $lock (@$locks) {
+	if ($conf->{lock} eq $lock) {
+	    delete $conf->{lock};
+
+	    my $cfspath = PVE::QemuConfig->cfs_config_path($id, $service_node);
+	    PVE::Cluster::cfs_write_file($cfspath, $conf);
+
+	    return $lock;
+	}
+    }
+
+    return undef;
+}
+
 1;
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index fe82332..bccc0e6 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -124,4 +124,19 @@ sub migrate {
 }
 
 
+sub remove_locks {
+    my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+    my $sid = $self->type() . ":$id";
+    my $hardware = $haenv->hardware();
+
+    foreach my $lock (@$locks) {
+	if (my $removed_lock = $hardware->unlock_service($sid, $lock)) {
+	    return $removed_lock;
+	}
+    }
+
+    return undef;
+}
+
 1;
diff --git a/src/test/test-locked-service1/log.expect b/src/test/test-locked-service1/log.expect
new file mode 100644
index 0000000..6bc04a1
--- /dev/null
+++ b/src/test/test-locked-service1/log.expect
@@ -0,0 +1,44 @@
+info      0     hardware: starting simulation
+info     20      cmdlist: execute power node1 on
+info     20    node1/crm: status change startup => wait_for_quorum
+info     20    node1/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node2 on
+info     20    node2/crm: status change startup => wait_for_quorum
+info     20    node2/lrm: status change startup => wait_for_agent_lock
+info     20      cmdlist: execute power node3 on
+info     20    node3/crm: status change startup => wait_for_quorum
+info     20    node3/lrm: status change startup => wait_for_agent_lock
+info     20    node1/crm: got lock 'ha_manager_lock'
+info     20    node1/crm: status change wait_for_quorum => master
+info     20    node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info     20    node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info     20    node1/crm: adding new service 'vm:103' on node 'node3'
+info     22    node2/crm: status change wait_for_quorum => slave
+info     24    node3/crm: status change wait_for_quorum => slave
+info     25    node3/lrm: got lock 'ha_agent_node3_lock'
+info     25    node3/lrm: status change wait_for_agent_lock => active
+info     25    node3/lrm: starting service vm:103
+info     25    node3/lrm: service status vm:103 started
+info    120      cmdlist: execute service vm:103 lock
+info    220      cmdlist: execute network node3 off
+info    220    node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info    224    node3/crm: status change slave => wait_for_quorum
+info    225    node3/lrm: status change active => lost_agent_lock
+info    260    node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info    260    node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+info    266     watchdog: execute power node3 off
+info    265    node3/crm: killed by poweroff
+info    266    node3/lrm: killed by poweroff
+info    266     hardware: server 'node3' stopped by poweroff (watchdog)
+info    340    node1/crm: got lock 'ha_agent_node3_lock'
+info    340    node1/crm: fencing: acknowleged - got agent lock for node 'node3'
+info    340    node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+info    340    node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node1'
+warn    340    node1/crm: removed leftover lock 'backup' from recovered service 'vm:103' to allow its start.
+info    340    node1/crm: service 'vm:103': state changed from 'fence' to 'started'  (node = node1)
+info    341    node1/lrm: got lock 'ha_agent_node1_lock'
+info    341    node1/lrm: status change wait_for_agent_lock => active
+info    341    node1/lrm: starting service vm:103
+info    341    node1/lrm: service status vm:103 started
+info    820     hardware: exit simulation - done
-- 
2.1.4




