[pve-devel] [PATCH pve-ha-manager] fix race condition on lrm_status read/write

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Oct 12 10:36:52 CEST 2015


Reading and writing the lrm_status files needs to be atomic, as the
LRM, the CRM, and the API each read and write them multiple times.
Use the HA lock domain, as the status of these files may have an
effect on the whole HA stack.

This was the supposed cause of bug #758, where an error occasionally
appeared in the master's syslog. It looked like:
got unexpected error - can't open '/etc/pve/nodes/<node>/lrm_status'
- No such file or directory
and was caused when trying to read the lrm_status file during a
write cycle.

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---
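Note: the locking pattern applied to both functions can be tried outside
a cluster with the minimal sketch below. It is an illustration under
stated assumptions, not the code from this patch: with_lock() is a
hypothetical stand-in for lock_ha_domain() that uses a local flock, the
file lives in a temporary directory instead of /etc/pve, and JSON::PP
replaces the PVE::HA::Tools helpers.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);
use JSON::PP;

# Hypothetical stand-in for lock_ha_domain(): runs $code while holding an
# exclusive flock on $lockfile. This only serializes processes on a single
# host; it is not a cluster-wide lock.
sub with_lock {
    my ($lockfile, $code) = @_;

    open(my $fh, '>>', $lockfile) or die "can't open '$lockfile' - $!\n";
    flock($fh, LOCK_EX) or die "can't lock '$lockfile' - $!\n";

    my $res = eval { $code->() };
    my $err = $@;

    close($fh); # closing the handle releases the lock
    die $err if $err;

    return $res;
}

# Assumed paths for the demo only; the patch uses /etc/pve/nodes/$node/.
my $dir = tempdir(CLEANUP => 1);
my ($lockfile, $filename) = ("$dir/lock", "$dir/lrm_status");

sub write_status {
    my ($status_obj) = @_;

    return with_lock($lockfile, sub {
        open(my $out, '>', $filename) or die "can't open '$filename' - $!\n";
        print $out encode_json($status_obj);
        close($out) or die "can't close '$filename' - $!\n";
        return;
    });
}

sub read_status {
    return with_lock($lockfile, sub {
        # fall back to an empty hash, like the {} default passed in the patch
        return +{} if !-e $filename;

        open(my $in, '<', $filename) or die "can't open '$filename' - $!\n";
        local $/; # slurp the whole file
        my $raw = <$in>;
        close($in);

        return decode_json($raw);
    });
}

write_status({ mode => 'active' });
my $status = read_status();
```

Because readers and writers take the same exclusive lock, a reader can
never run in the middle of a write cycle. The cost is that reads are
serialized as well.
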
 src/PVE/HA/Config.pm | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index e54a039..f0aadf1 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -54,9 +54,12 @@ sub read_lrm_status {
 
     die "undefined node" if !defined($node);
 
-    my $filename = "/etc/pve/nodes/$node/lrm_status";
+    my $code = sub {
+	my $filename = "/etc/pve/nodes/$node/lrm_status";
+	return PVE::HA::Tools::read_json_from_file($filename, {});
+    };
 
-    return PVE::HA::Tools::read_json_from_file($filename, {});  
+    return lock_ha_domain($code);
 }
 
 sub write_lrm_status {
@@ -64,9 +67,12 @@ sub write_lrm_status {
 
     die "undefined node" if !defined($node);
 
-    my $filename = "/etc/pve/nodes/$node/lrm_status";
+    my $code = sub {
+	my $filename = "/etc/pve/nodes/$node/lrm_status";
+	PVE::HA::Tools::write_json_to_file($filename, $status_obj);
+    };
 
-    PVE::HA::Tools::write_json_to_file($filename, $status_obj); 
+    return lock_ha_domain($code);
 }
 
 sub parse_groups_config {
-- 
2.1.4




