[pve-devel] [PATCH pve-ha-manager] fix race condition on lrm_status read/write
Dietmar Maurer
dietmar at proxmox.com
Mon Oct 12 12:17:29 CEST 2015
While I can see the bug, the fix looks incorrect to me.
PVE::Tools::file_set_contents is atomic, so it never deletes the file.
So maybe this is a bug inside pmxcfs. It would be great to have a test case
that triggers the bug.
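
To illustrate what "atomic" means here, below is a minimal sketch of the
usual write-to-temp-file-then-rename pattern. The helper name atomic_write
and the temporary directory are made up for illustration; this is not the
actual PVE::Tools::file_set_contents code. It also assumes a local POSIX
file system, where rename() replaces the target in one step, so a reader
sees either the old or the new content but never a missing file. Whether
pmxcfs gives the same guarantee is exactly the open question.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper for illustration only, NOT PVE::Tools::file_set_contents.
# Writes $data to a temporary file next to $filename, then renames it over
# the target. The target is never unlinked or truncated in place.
sub atomic_write {
    my ($filename, $data) = @_;

    my $tmpname = "$filename.tmp.$$";

    open(my $fh, '>', $tmpname) or die "unable to open '$tmpname' - $!\n";
    print {$fh} $data or die "unable to write '$tmpname' - $!\n";
    close($fh) or die "closing '$tmpname' failed - $!\n";

    if (!rename($tmpname, $filename)) {
        my $err = $!;
        unlink $tmpname;    # do not leave the temporary file behind
        die "rename '$tmpname' to '$filename' failed - $err\n";
    }
}

# Usage on a scratch directory (assumption: an ordinary local file system).
my $dir = tempdir(CLEANUP => 1);
my $file = "$dir/lrm_status";

atomic_write($file, "{}\n");
atomic_write($file, "{\"state\":\"active\"}\n");
```

With this pattern a concurrent reader should not get "No such file or
directory" on a local file system, which is why the reported error points
at something other than the write itself.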
On 10/12/2015 10:36 AM, Thomas Lamprecht wrote:
> Reading and writing the lrm_status files need to be atomic as the
> LRM, CRM, and the API have multiple read/writes on it.
> Use the HA lock domain, as the status of these files may have an
> effect on the whole HA stack.
>
> This was the supposed cause of bug #758, where an error appeared
> occasionally in the master's syslog. It looked like:
> got unexpected error - can't open '/etc/pve/nodes/<node>/lrm_status'
> - No such file or directory
> and was caused when trying to read the lrm_status file during a
> write cycle.
>
> Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
> ---
> src/PVE/HA/Config.pm | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
> index e54a039..f0aadf1 100644
> --- a/src/PVE/HA/Config.pm
> +++ b/src/PVE/HA/Config.pm
> @@ -54,9 +54,12 @@ sub read_lrm_status {
>
> die "undefined node" if !defined($node);
>
> - my $filename = "/etc/pve/nodes/$node/lrm_status";
> + my $code = sub {
> + my $filename = "/etc/pve/nodes/$node/lrm_status";
> + return PVE::HA::Tools::read_json_from_file($filename, {});
> + };
>
> - return PVE::HA::Tools::read_json_from_file($filename, {});
> + return lock_ha_domain($code);
> }
>
> sub write_lrm_status {
> @@ -64,9 +67,12 @@ sub write_lrm_status {
>
> die "undefined node" if !defined($node);
>
> - my $filename = "/etc/pve/nodes/$node/lrm_status";
> + my $code = sub {
> + my $filename = "/etc/pve/nodes/$node/lrm_status";
> + PVE::HA::Tools::write_json_to_file($filename, $status_obj);
> + };
>
> - PVE::HA::Tools::write_json_to_file($filename, $status_obj);
> + return lock_ha_domain($code);
> }
>
> sub parse_groups_config {