[pve-devel] [PATCH] migrate : add nocheck for resume
Alexandre DERUMIER
aderumier at odiso.com
Wed Oct 14 19:07:49 CEST 2015
I ran a test with a loop that moves a file every second,
and monitored the delay between the change on the source node and on the target node.
The delays are between 10ms and 300ms, with spikes up to 1s,
so this can explain the race.
(I can't explain the variation in speed or the spikes.)
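For reference, a measurement of this kind can be sketched as a small polling watcher run on each node. This is a minimal Python sketch and not the script actually used for the test: the names `stamp` and `watch`, the polling interval, and the output layout are assumptions made for illustration. It detects changes with stat()-based polling (`os.path.exists`), and comparing output across nodes assumes their clocks are synchronized.

```python
import os
import time
from datetime import datetime


def stamp(now=None):
    """Format a timestamp as 'YYYYMMDD HH:MM:SS.mmm' (millisecond precision)."""
    now = now or datetime.now()
    return now.strftime("%Y%m%d %H:%M:%S.") + f"{now.microsecond // 1000:03d}"


def watch(path, interval=0.001, max_events=None, timeout=None):
    """Poll `path` and record every exist/notexist transition.

    Hypothetical helper: stops after `max_events` transitions or after
    `timeout` seconds, whichever comes first (runs forever if both are None).
    Returns a list of (state, timestamp) tuples.
    """
    events = []
    present = os.path.exists(path)  # initial state, not reported
    deadline = None if timeout is None else time.monotonic() + timeout
    while max_events is None or len(events) < max_events:
        if deadline is not None and time.monotonic() > deadline:
            break
        now_present = os.path.exists(path)
        if now_present != present:
            present = now_present
            event = ("exist" if present else "notexist", stamp())
            events.append(event)
            print(f"{event[0]:<9}{event[1]}")
        time.sleep(interval)
    return events
```

Running `watch()` against the same file on the source and the target node, then comparing the timestamps of matching transitions, gives the propagation delay per event.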
Another problem:
I hit the bug again, and right afterwards I could no longer migrate the VM.
The HA migrate task starts, but after that the migration itself never takes place.
pve-ha-crm floods the log in a loop:
Oct 14 19:01:16 kvmtest1 pve-ha-crm[3819]: service 'vm:125': state changed from 'migrate' to 'started' (node = kvmtest2)
Oct 14 19:01:16 kvmtest1 pve-ha-crm[3819]: migrate service 'vm:125' to node 'kvmtest1' (running)
Oct 14 19:01:16 kvmtest1 pve-ha-crm[3819]: service 'vm:125': state changed from 'started' to 'migrate' (node = kvmtest2, target = kvmtest1)
Oct 14 19:01:26 kvmtest1 pve-ha-crm[3819]: service 'vm:125' - migration failed (exit code 255)
Oct 14 19:01:26 kvmtest1 pve-ha-crm[3819]: service 'vm:125': state changed from 'migrate' to 'started' (node = kvmtest2)
Oct 14 19:01:26 kvmtest1 pve-ha-crm[3819]: migrate service 'vm:125' to node 'kvmtest1' (running)
Oct 14 19:01:26 kvmtest1 pve-ha-crm[3819]: service 'vm:125': state changed from 'started' to 'migrate' (node = kvmtest2, target = kvmtest1)
Oct 14 19:04:33 kvmtest2 pve-ha-lrm[28430]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:04:43 kvmtest2 pve-ha-lrm[28451]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:04:53 kvmtest2 pve-ha-lrm[28472]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:03 kvmtest2 pve-ha-lrm[28493]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:13 kvmtest2 pve-ha-lrm[28520]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:23 kvmtest2 pve-ha-lrm[28541]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:33 kvmtest2 pve-ha-lrm[28562]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:43 kvmtest2 pve-ha-lrm[28583]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:05:53 kvmtest2 pve-ha-lrm[28604]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
Oct 14 19:06:03 kvmtest2 pve-ha-lrm[28626]: service 'vm:125' not on this node at /usr/share/perl5/PVE/HA/Env/PVE2.pm line 389.
----- Original message -----
From: "aderumier" <aderumier at odiso.com>
To: "dietmar" <dietmar at proxmox.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday 14 October 2015 16:17:24
Subject: Re: [pve-devel] [PATCH] migrate : add nocheck for resume
>>To be sure, I would also test with my direct_io patch for fuse...
Yes, I'm currently using it.
I have written a simple Perl script which monitors creation/deletion of the VM conf file,
and the times are indeed correct compared with notify:
node1
-----
exist    20151014 16:14:06.183
notexist 20151014 16:14:38.989
exist    20151014 16:15:07.066
node2
-----
notexist 20151014 16:14:06.208
exist    20151014 16:14:39.003
notexist 20151014 16:15:07.089
I'll try to reproduce the problem and compare the times again.
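As a worked check of the timestamps listed above, the per-event delay between node1 and node2 can be computed as follows. This is only an illustrative sketch: the helper name `delays_ms` is hypothetical, the values are my reading of the posted lines, and the result is meaningful only if both nodes' clocks are synchronized.

```python
from datetime import datetime

FMT = "%Y%m%d %H:%M:%S.%f"  # %f accepts the 3-digit millisecond field

# Timestamps as posted: each node1 event paired with the matching node2 event.
node1 = ["20151014 16:14:06.183", "20151014 16:14:38.989", "20151014 16:15:07.066"]
node2 = ["20151014 16:14:06.208", "20151014 16:14:39.003", "20151014 16:15:07.089"]


def delays_ms(first, second):
    """Return the delay in milliseconds between paired timestamps."""
    return [
        round((datetime.strptime(b, FMT) - datetime.strptime(a, FMT)).total_seconds() * 1000)
        for a, b in zip(first, second)
    ]


print(delays_ms(node1, node2))  # prints [25, 14, 23]
```

For this run the three events arrive on node2 within 14 to 25 ms of node1.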
----- Original message -----
From: "dietmar" <dietmar at proxmox.com>
To: "aderumier" <aderumier at odiso.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday 14 October 2015 16:00:28
Subject: Re: [pve-devel] [PATCH] migrate : add nocheck for resume
> http://search.cpan.org/~andya/File-Monitor-1.00/lib/File/Monitor.pm
>
> which used stat() to detect changes
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel