[pve-devel] [PATCH v4 ha-manager 1/3] cleanup backup & mounted locks after recovery (fixes #1100)
Thomas Lamprecht
t.lamprecht at proxmox.com
Thu Sep 15 10:45:15 CEST 2016
This cleans up 'backup' and 'mounted' locks after a service is
recovered from a failed node, otherwise it may not start if a
lock-holding action was interrupted by the node failure.
We allow deletion of 'backup' and 'mounted' locks, as doing so is safe
once the node which hosted the locked service has been fenced.
We do not allow deletion of 'snapshot' locks, as those need more manual
cleanup; they are also normally triggered manually.
Further, ignore migration locks for now; we should rethink that, but
as migration is a manually triggered action it should be OK to not
auto-delete its lock for now.
We cannot remove locks via the remove_lock method provided by
PVE::AbstractConfig, as this method is well behaved and does not
allow removing locks from VMs/CTs located on another node. We also
do not want to adapt this method to allow arbitrary lock removal,
independent of which node the config is located on, as this could
invite misuse in the future. After all, one of our base principles is
that a node owns its VMs/CTs (and their configs) and only the owner
itself may change the status of a VM/CT.
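As an illustration, here is a minimal sketch of such a well behaved
helper (not the actual PVE::AbstractConfig code; load_config and
write_config are real PVE::AbstractConfig methods, the rest is an
assumption for illustration):

sub remove_lock_sketch {
    my ($class, $vmid, $lock) = @_;

    # loading without an explicit node argument resolves the config on
    # the local node only - this is the ownership guard: it dies if the
    # guest's config does not live on the current node
    my $conf = $class->load_config($vmid);

    my $found = $conf->{lock};
    return undef if !defined($found);
    return undef if defined($lock) && $found ne $lock;

    delete $conf->{lock};
    $class->write_config($vmid, $conf);

    return $found;
}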
The HA manager needs to be able to change the state of services
when a node has failed, and is also allowed to do so, but only if the
node is fenced and we need to recover a service from it.
So we (re)implement the lock removal functionality in the resource
plugins.
We call it only if a node was fenced, and only *before* stealing
the service. After all, our justification for removing a lock is that
the owner (the node) is fenced. After stealing the service we have
already changed the owner, and the new owner is *not* fenced, so our
justification no longer holds - the new owner may already be doing
things with the service (config changes, etc.).
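In short, the call order in the manager (cf. the Manager.pm hunk
below) is:

    # the lock cleanup must run while the fenced node still owns the
    # service, i.e., while our justification for removal holds
    &$fence_recovery_cleanup($self, $sid, $fenced_node);
    # from here on the recovery node is the owner; it is live and may
    # use the service, so removing locks would no longer be safe
    $haenv->steal_service($sid, $sd->{node}, $recovery_node);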
Add the respective log.expect output from the added test to enable
regression testing this issue.
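The simulated scenario consists essentially of the following steps (a
hypothetical reconstruction of the test's cmdlist, inferred from the
'cmdlist: execute' lines in the log below; the actual file may differ):

[
    [ "power node1 on", "power node2 on", "power node3 on" ],
    [ "service vm:103 lock" ],
    [ "network node3 off" ]
]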
Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
changes since v3:
* rebased on current master
src/PVE/HA/Manager.pm | 22 ++++++++++++++++
src/PVE/HA/Resources.pm | 6 +++++
src/PVE/HA/Resources/PVECT.pm | 23 +++++++++++++++++
src/PVE/HA/Resources/PVEVM.pm | 23 +++++++++++++++++
src/PVE/HA/Sim/Resources.pm | 15 +++++++++++
src/test/test-locked-service1/log.expect | 44 ++++++++++++++++++++++++++++++++
6 files changed, 133 insertions(+)
create mode 100644 src/test/test-locked-service1/log.expect
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index c60df7c..e6dab7a 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -238,6 +238,26 @@ my $change_service_state = sub {
" to '${new_state}'$text_state");
};
+# clean up a possible bad state from a recovered service to allow its start
+my $fence_recovery_cleanup = sub {
+ my ($self, $sid, $fenced_node) = @_;
+
+ my $haenv = $self->{haenv};
+
+ my (undef, $type, $id) = PVE::HA::Tools::parse_sid($sid);
+ my $plugin = PVE::HA::Resources->lookup($type);
+
+ # should not happen
+ die "unknown resource type '$type'" if !$plugin;
+
+ # locks may block recovery, cleanup those which are safe to remove after fencing
+ my $removable_locks = ['backup', 'mounted'];
+ if (my $removed_lock = $plugin->remove_locks($haenv, $id, $removable_locks, $fenced_node)) {
+ $haenv->log('warning', "removed leftover lock '$removed_lock' from recovered " .
+ "service '$sid' to allow its start.");
+ }
+};
+
# after a node was fenced this recovers the service to a new node
my $recover_fenced_service = sub {
my ($self, $sid, $cd) = @_;
@@ -264,6 +284,8 @@ my $recover_fenced_service = sub {
$haenv->log('info', "recover service '$sid' from fenced node " .
"'$fenced_node' to node '$recovery_node'");
+ &$fence_recovery_cleanup($self, $sid, $fenced_node);
+
$haenv->steal_service($sid, $sd->{node}, $recovery_node);
# $sd *is normally read-only*, fencing is the exception
diff --git a/src/PVE/HA/Resources.pm b/src/PVE/HA/Resources.pm
index 3836fc8..96d2f8f 100644
--- a/src/PVE/HA/Resources.pm
+++ b/src/PVE/HA/Resources.pm
@@ -124,6 +124,12 @@ sub check_running {
die "implement in subclass";
}
+sub remove_locks {
+ my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+ die "implement in subclass";
+}
+
# package PVE::HA::Resources::IPAddr;
diff --git a/src/PVE/HA/Resources/PVECT.pm b/src/PVE/HA/Resources/PVECT.pm
index b6ebe2f..d1312ab 100644
--- a/src/PVE/HA/Resources/PVECT.pm
+++ b/src/PVE/HA/Resources/PVECT.pm
@@ -114,4 +114,27 @@ sub check_running {
return PVE::LXC::check_running($vmid);
}
+sub remove_locks {
+ my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+ $service_node = $service_node || $haenv->nodename();
+
+ my $conf = PVE::LXC::Config->load_config($id, $service_node);
+
+ return undef if !defined($conf->{lock});
+
+ foreach my $lock (@$locks) {
+ if ($conf->{lock} eq $lock) {
+ delete $conf->{lock};
+
+ my $cfspath = PVE::LXC::Config->cfs_config_path($id, $service_node);
+ PVE::Cluster::cfs_write_file($cfspath, $conf);
+
+ return $lock;
+ }
+ }
+
+ return undef;
+}
+
1;
diff --git a/src/PVE/HA/Resources/PVEVM.pm b/src/PVE/HA/Resources/PVEVM.pm
index 4c06df9..55d4368 100644
--- a/src/PVE/HA/Resources/PVEVM.pm
+++ b/src/PVE/HA/Resources/PVEVM.pm
@@ -116,4 +116,27 @@ sub check_running {
return PVE::QemuServer::check_running($vmid, 1, $nodename);
}
+sub remove_locks {
+ my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+ $service_node = $service_node || $haenv->nodename();
+
+ my $conf = PVE::QemuConfig->load_config($id, $service_node);
+
+ return undef if !defined($conf->{lock});
+
+ foreach my $lock (@$locks) {
+ if ($conf->{lock} eq $lock) {
+ delete $conf->{lock};
+
+ my $cfspath = PVE::QemuConfig->cfs_config_path($id, $service_node);
+ PVE::Cluster::cfs_write_file($cfspath, $conf);
+
+ return $lock;
+ }
+ }
+
+ return undef;
+}
+
1;
diff --git a/src/PVE/HA/Sim/Resources.pm b/src/PVE/HA/Sim/Resources.pm
index fe82332..bccc0e6 100644
--- a/src/PVE/HA/Sim/Resources.pm
+++ b/src/PVE/HA/Sim/Resources.pm
@@ -124,4 +124,19 @@ sub migrate {
}
+sub remove_locks {
+ my ($self, $haenv, $id, $locks, $service_node) = @_;
+
+ my $sid = $self->type() . ":$id";
+ my $hardware = $haenv->hardware();
+
+ foreach my $lock (@$locks) {
+ if (my $removed_lock = $hardware->unlock_service($sid, $lock)) {
+ return $removed_lock;
+ }
+ }
+
+ return undef;
+}
+
1;
diff --git a/src/test/test-locked-service1/log.expect b/src/test/test-locked-service1/log.expect
new file mode 100644
index 0000000..6bc04a1
--- /dev/null
+++ b/src/test/test-locked-service1/log.expect
@@ -0,0 +1,44 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:103' on node 'node3'
+info 22 node2/crm: status change wait_for_quorum => slave
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:103
+info 25 node3/lrm: service status vm:103 started
+info 120 cmdlist: execute service vm:103 lock
+info 220 cmdlist: execute network node3 off
+info 220 node1/crm: node 'node3': state changed from 'online' => 'unknown'
+info 224 node3/crm: status change slave => wait_for_quorum
+info 225 node3/lrm: status change active => lost_agent_lock
+info 260 node1/crm: service 'vm:103': state changed from 'started' to 'fence'
+info 260 node1/crm: node 'node3': state changed from 'unknown' => 'fence'
+info 266 watchdog: execute power node3 off
+info 265 node3/crm: killed by poweroff
+info 266 node3/lrm: killed by poweroff
+info 266 hardware: server 'node3' stopped by poweroff (watchdog)
+info 340 node1/crm: got lock 'ha_agent_node3_lock'
+info 340 node1/crm: fencing: acknowleged - got agent lock for node 'node3'
+info 340 node1/crm: node 'node3': state changed from 'fence' => 'unknown'
+info 340 node1/crm: recover service 'vm:103' from fenced node 'node3' to node 'node1'
+warn 340 node1/crm: removed leftover lock 'backup' from recovered service 'vm:103' to allow its start.
+info 340 node1/crm: service 'vm:103': state changed from 'fence' to 'started' (node = node1)
+info 341 node1/lrm: got lock 'ha_agent_node1_lock'
+info 341 node1/lrm: status change wait_for_agent_lock => active
+info 341 node1/lrm: starting service vm:103
+info 341 node1/lrm: service status vm:103 started
+info 820 hardware: exit simulation - done
--
2.1.4