[pve-devel] [PATCH pve-zsync] Allow detecting a syncing instance of a job
Fabian Grünbichler
f.gruenbichler at proxmox.com
Tue Oct 1 10:29:00 CEST 2019
On September 30, 2019 12:55 pm, Fabian Ebner wrote:
> Before, the check whether a syncing instance of the same job is already present
> was inside the locked section. This caused cron to continuously spawn new
> instances of pve-zsync on syncs (or rather groups of syncs) taking longer
> than 15 minutes, see [0] in the forum. This patch introduces a new locked
> section for checking the current status and a new 'waiting' status.
> The 'waiting' status is needed to mark jobs which are currently waiting
> for the lock for syncing. So if job A is syncing and job B is waiting for
> the lock then all new instances of job B will see that one instance is
> already scheduled to sync.
>
> [0]: https://forum.proxmox.com/threads/pve-zsync-bug-spawns-endless-cron-processes.58087/
>
> Signed-off-by: Fabian Ebner <f.ebner at proxmox.com>
> ---
> pve-zsync | 25 ++++++++++++++++++++-----
> 1 file changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/pve-zsync b/pve-zsync
> index 425ffa2..90c1bb3 100755
> --- a/pve-zsync
> +++ b/pve-zsync
> @@ -19,6 +19,7 @@ my $PVE_DIR = "/etc/pve/local";
> my $QEMU_CONF = "${PVE_DIR}/qemu-server";
> my $LXC_CONF = "${PVE_DIR}/lxc";
> my $LOCKFILE = "$CONFIG_PATH/${PROGNAME}.lock";
> +my $LOCKFILE_STATUS_CHECK = "$CONFIG_PATH/${PROGNAME}_status_check.lock";
> my $PROG_PATH = "$PATH/${PROGNAME}";
> my $INTERVAL = 15;
> my $DEBUG;
> @@ -578,20 +579,34 @@ sub destroy_job {
> sub sync {
> my ($param) = @_;
>
> + my $lock_status_check_fh = IO::File->new("> $LOCKFILE_STATUS_CHECK");
> + die "Can't open Lock File: $LOCKFILE_STATUS_CHECK $!\n" if !$lock_status_check_fh;
> + lock($lock_status_check_fh);
> +
> + my $job;
> + eval {
> + $job = get_job($param);
> + };
> +
> + if ($job && defined($job->{state}) && ($job->{state} eq "syncing" || $job->{state} eq "waiting")) {
> + unlock($lock_status_check_fh);
> + die "Job --source $param->{source} --name $param->{name} is already scheduled to sync\n";
> + }
> +
> + $job->{state} = "waiting";
> + update_state($job);
> + unlock($lock_status_check_fh);
I don't think we need a (new) lock here - it would only protect against
other processes entering sync(), but they could at worst do the same
change (again)? update_state() already has locking to prevent races
around the modification itself, and this new lock does not protect
against other read-modify-write cycles (like destroy_job, enable_job,
disable_job, but also init which just does write ;)).
if we introduce a new lock, it should be to guard all state r-m-w cycles
against races with eachother (e.g., by implementing Thomas locked()
suggestion, and pulling the lock from update_state there and updating
all call sites). the other locks (sync, update_cron) can also re-use
this locked mechanism, by passing in some lock identifier/filename
(you can also have wrappers around locked, e.g. lock_state, lock_cron
and lock_sync).
> +
> my $lock_fh = IO::File->new("> $LOCKFILE");
> die "Can't open Lock File: $LOCKFILE $!\n" if !$lock_fh;
> lock($lock_fh);
>
> + #job might've changed while we waited for the lock, but we can be sure it's not syncing
> my $date = get_date();
> - my $job;
> eval {
> $job = get_job($param);
> };
>
> - if ($job && defined($job->{state}) && $job->{state} eq "syncing") {
> - die "Job --source $param->{source} --name $param->{name} is syncing at the moment";
> - }
> -
> my $dest = parse_target($param->{dest});
> my $source = parse_target($param->{source});
>
> --
> 2.20.1
>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel at pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
>
More information about the pve-devel
mailing list